<?xml version="1.0" encoding="UTF-8"?>
<rss  xmlns:atom="http://www.w3.org/2005/Atom" 
      xmlns:media="http://search.yahoo.com/mrss/" 
      xmlns:content="http://purl.org/rss/1.0/modules/content/" 
      xmlns:dc="http://purl.org/dc/elements/1.1/" 
      version="2.0">
<channel>
<title>Sarem Seitz</title>
<link>https://www.sarem-seitz.com/</link>
<atom:link href="https://www.sarem-seitz.com/index.xml" rel="self" type="application/rss+xml"/>
<description></description>
<generator>quarto-1.5.57</generator>
<lastBuildDate>Tue, 01 Oct 2024 00:00:00 GMT</lastBuildDate>
<item>
  <title>Hosting a static Blog on Bare-Metal Kubernetes - This could have been a GitHub pages site…</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/hosting-a-static-blog-on-kubernetes.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>In today’s tech landscape, Kubernetes has become synonymous with scalable and resilient application hosting. But what happens when you combine it with the relatively humble task of hosting a static blog? Well, I went down that rabbit hole and set up this blog on a bare-metal Kubernetes cluster. To be exact, this is the second time I have done this. After switching back and forth between various content management systems for blogs, I have decided it was finally time to move back to a static <a href="https://quarto.org/">quarto</a> site, hosted on good old K8s.</p>
<p>Could I have done this with GitHub Pages? Sure. Did I want to take the long (and more complicated) route? Absolutely.</p>
<p>Let me walk you through my experience setting up a Kubernetes-powered blog, and how it all came together.</p>
</section>
<section id="design-goals" class="level2">
<h2 class="anchored" data-anchor-id="design-goals">Design Goals</h2>
<p>From the outset, I had some design goals for this project:</p>
<ol type="1">
<li><strong>Cost-effectiveness</strong>: I wanted the whole setup to be cheap. No managed Kubernetes, no fancy cloud providers — just affordable VMs running the underlying Kubernetes nodes.</li>
<li><strong>Continuous deployment</strong>: Any push to the blog’s master branch in GitHub should automatically deploy the latest blog content.</li>
<li><strong>Notebooks integration</strong>: There also exists <a href="https://github.com/SaremS/sample_notebooks">another repository</a> where I store raw, mostly uncommented experiments as Jupyter notebooks. Although they are largely code-only, they might still be interesting to read, so they should be included here as well. Thus, each deployment of the main blog should also copy and include all notebooks from this second repository.</li>
<li><strong>Simplicity</strong>: Although the underlying infrastructure is rather complex, deploying and updating the blog should be rather simple. This means, ideally, no crazy CRD extensions but only out-of-the-box Kubernetes resources where possible.</li>
</ol>
</section>
<section id="kubernetes-cluster-setup" class="level2">
<h2 class="anchored" data-anchor-id="kubernetes-cluster-setup">Kubernetes Cluster Setup</h2>
<section id="node-setup" class="level3">
<h3 class="anchored" data-anchor-id="node-setup">Node Setup</h3>
<p>Since managed Kubernetes was out of the question, I opted for a bare-metal Kubernetes setup with VMs. I rented three virtual machines from <a href="https://contabo.com/">Contabo</a>, a fairly budget-friendly hosting provider (no affiliation on my end). While they are not the most reliable VM provider and spinning up a new VM takes quite some time, they are cheap and things are working nicely most of the time.</p>
<p>With three machines as cluster nodes, I created one master and two worker nodes. This obviously doesn’t ensure high availability, but keep in mind that this is still a personal blog and not a critical production environment.</p>
<p>To automate some initial configuration, I created an <a href="https://www.ansible.com/">Ansible</a> playbook. The primary advantage here is being able to quickly tear down and rebuild the entire setup if necessary. As managing all necessary firewall configurations between the VMs turned out to be quite tedious, I also used <a href="https://www.wireguard.com/">WireGuard</a> to set up a virtual network overlay. On the one hand, this adds some communication overhead. On the other hand, it left only the WireGuard port to worry about for communication amongst the cluster nodes.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/hosting-a-static-blog-on-kubernetes/wireguard-ansible.png" class="img-fluid figure-img"></p>
<figcaption>Some excerpt from the WireGuard Ansible playbook. Someday, I’ll hopefully have a playbook that completely automates the cluster setup…</figcaption>
</figure>
</div>
</section>
<section id="cluster-setup" class="level3">
<h3 class="anchored" data-anchor-id="cluster-setup">Cluster Setup</h3>
<p>For the Kubernetes cluster, I initially tried a raw <em>kubeadm</em> setup, but configuring <a href="https://cilium.io/">Cilium</a> as the CNI turned out to be much harder than expected. Accepting my defeat (for now), I went with Rancher’s <a href="https://docs.rke2.io/">RKE2</a> and things went relatively smoothly from there. Installing <a href="https://argo-cd.readthedocs.io/en/stable/">ArgoCD</a> then completed the more or less manual part of the setup. Every other installation is now managed by Argo.</p>
<p>Although persistent storage is not necessary at this point, I did some experiments that required setting up PVCs. As this is already an RKE2 cluster, it made sense to use <a href="https://longhorn.io/">Longhorn</a>, which was also reasonably straightforward to install.</p>
<p>Finally, to enable external traffic, I added <a href="https://metallb.universe.tf/">metallb</a> to the Kubernetes cluster and installed <a href="https://nginx.org/en/">nginx</a> on the control-plane VM. Nginx then accepts external traffic and performs a proxy pass to the exposed metallb services.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/hosting-a-static-blog-on-kubernetes/nginx-metallb.svg" class="img-fluid figure-img"></p>
<figcaption>Nginx acts as a reverse proxy and passes external traffic to the corresponding metallb service</figcaption>
</figure>
</div>
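<p>As a rough sketch, the relevant nginx server block looks something like this (hypothetical IP address and ports — the actual address is whatever metallb assigned to the exposed service):</p>

```nginx
server {
    listen 80;
    server_name www.sarem-seitz.com;

    location / {
        # forward external traffic to the metallb LoadBalancer IP
        proxy_pass http://10.0.0.240:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```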
<p>You can find all cluster-wide ArgoCD installations <a href="https://github.com/SaremS/kubernetes-deployments/tree/master/cluster">here</a>.</p>
</section>
</section>
<section id="hosting-the-blog-on-kubernetes" class="level2">
<h2 class="anchored" data-anchor-id="hosting-the-blog-on-kubernetes">Hosting the Blog on Kubernetes</h2>
<p>Now that the cluster was up and running, it was time to host the actual blog. Remember that an important goal is to keep the manual effort for deploying new edits to a minimum. <a href="https://quarto.org/docs/publishing/github-pages.html#github-action">As described in the quarto docs</a>, probably the easiest way to host and deploy quarto sites is via <a href="https://pages.github.com/">GitHub Pages</a>. A simple GitHub Actions pipeline would be sufficient here to auto-deploy any updates to the underlying notebooks.</p>
<p>Since I want to keep open the option to add non-static functionality later on, this approach is unfortunately out of the question. Rather, the same automation needs to be replicated on the more complex Kubernetes setup. At the very least, I wanted to avoid having to manually update some image tag in the ArgoCD deployment whenever I edited the blog.</p>
<p>First, I was playing around with a small customization of the <a href="https://github.com/kubernetes/git-sync">git-sync image</a>. The idea was to let the git-sync container poll for new commits in the <a href="https://github.com/SaremS/blog">blog repo</a> and then trigger a Kubernetes Job to re-render the quarto files. The advantage here was that the repository containing the blog files only had to contain the notebooks and the quarto yaml files. In essence, the blog would have been more or less completely decoupled from any infrastructure or deployment assumptions. However, there was still the issue of incorporating the notebooks from the other repository.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/hosting-a-static-blog-on-kubernetes/go-job.png" class="img-fluid figure-img"></p>
<figcaption>A Go binary to trigger a Kubernetes job after a successful git-sync update - too complex after all</figcaption>
</figure>
</div>
<p>At some point, I realized that this is unnecessarily complex, so I decided to go with a single container that ultimately exposes a Caddy static file server. You can find the whole build process in the <a href="https://github.com/SaremS/blog/blob/master/.github/workflows/main.yml">GitHub-Actions configuration</a> and the corresponding <a href="https://github.com/SaremS/blog/blob/master/Dockerfile">Dockerfile</a>.</p>
<p>Now, whenever the blog is updated, a new blog image is built and tagged with the current datetime. Inside the build process, we add the notebooks from the second repository and slightly modify the quarto configuration to include those additional files. Vice versa, whenever the second repository is updated, that repo <a href="https://github.com/SaremS/sample_notebooks/blob/master/.github/workflows/main.yml">remotely triggers</a> the blog’s build process, too. This has been working nicely, without any issues so far.</p>
<p>To avoid manual updates to the image tags, I have also added the <a href="https://argocd-image-updater.readthedocs.io/en/stable/">ArgoCD Image Updater</a>. By simply adding the annotations,</p>
<pre><code>annotations:
    argocd-image-updater.argoproj.io/image-list: blog=ghcr.io/sarems/blog 
    argocd-image-updater.argoproj.io/blog.update-strategy: alphabetical</code></pre>
<p>to the <a href="https://github.com/SaremS/kubernetes-deployments/blob/master/blog/blog-argocd-app-helm.yaml">ArgoCD application yaml</a>, the Image Updater automatically polls the <code>ghcr.io</code> image repository for updates.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/hosting-a-static-blog-on-kubernetes/image-updater-annotation.png" class="img-fluid figure-img"></p>
<figcaption>Only two annotations to automate the image updates</figcaption>
</figure>
</div>
<p>If it finds a new image, the Argo application is automatically updated to use the newer image. The only caveat is that the ArgoCD deployment needs to be based on either a Helm chart or a Kustomization. Here, I went with a rather simple <a href="https://github.com/SaremS/kubernetes-deployments/tree/master/blog/quarto-blog">Helm chart</a>.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/hosting-a-static-blog-on-kubernetes/argo-applications.png" class="img-fluid figure-img"></p>
<figcaption>Everything’s finally green</figcaption>
</figure>
</div>
</section>
<section id="conclusion-and-takeaways" class="level2">
<h2 class="anchored" data-anchor-id="conclusion-and-takeaways">Conclusion and takeaways</h2>
<p>Could I have hosted this blog on GitHub Pages? Absolutely. But where’s the fun in that? At the very least, I have improved a bit at setting up bare-metal clusters from scratch. I still consider my failure to set it up with raw kubeadm in time a personal weakness, though…</p>
<p>Nevertheless, the whole setup has been running perfectly stably so far. Keep in mind, though, that this is a rather small, personal blog with low traffic. Time will tell whether it would keep up with larger traffic, but I am reasonably optimistic.</p>
<p>What is definitely cool about this approach is being able to host other applications on the cluster and then incorporate them into rendered notebooks. I have successfully tested this idea in an old version of this blog. Thus, I’ll hopefully soon find time to showcase more applied things here rather than just writing about theoretical model ideas.</p>
<p>If you have any questions or ideas for improvements, please feel free to write me an email. I’ll absolutely try to answer, but time is quite sparse these days.</p>


</section>

 ]]></description>
  <category>Kubernetes</category>
  <guid>https://www.sarem-seitz.com/posts/hosting-a-static-blog-on-kubernetes.html</guid>
  <pubDate>Tue, 01 Oct 2024 00:00:00 GMT</pubDate>
  <media:content url="https://www.sarem-seitz.com/images/hosting-a-static-blog-on-kubernetes/argo-applications.png" medium="image" type="image/png" height="47" width="144"/>
</item>
<item>
  <title>With PyTorch, I can Gradient Boost anything</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p><a href="https://en.wikipedia.org/wiki/Gradient_boosting">Gradient Boosting</a> is a powerful machine learning technique that can be used for both regression and classification problems. Its fundamental idea is to combine weak, almost trivial base models into a single strong ensemble. A fairly unique feature of Gradient Boosting is that it minimizes the loss by calculating derivatives with respect to the model output, rather than the model parameters.</p>
<p>This makes it possible to use non-differentiable models as base learners, typically decision trees. In fact, there is probably not a single popular Gradient Boosting library that does not rely on trees. As a consequence, Boosting models handle complex, non-smooth data generating processes fairly well, and popular implementations like <a href="https://xgboost.readthedocs.io/en/latest/">XGBoost</a> and <a href="https://lightgbm.readthedocs.io/en/latest/">LightGBM</a> regularly dominate Kaggle competitions and the like.</p>
<p>PyTorch, on the other hand, is rather well known for programming Deep Learning models. Thus, the idea of using PyTorch for Gradient Boosting might seem a bit odd at first. However, its capabilities are not limited to Neural Networks; it is actually a much richer framework for anything involving <a href="https://en.wikipedia.org/wiki/Automatic_differentiation">automatic differentiation</a>. Whatever function we can express in its domain language, PyTorch will find the corresponding derivatives.</p>
<p>Hence, it should be interesting to let PyTorch find the derivatives of increasingly complex loss functions. Then, we can apply the standard Gradient Boosting algorithm to fit simple Decision Trees to those derivatives. This way, we can hopefully create models that are more flexible than standard Gradient Boosting implementations, but still as powerful.</p>
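<p>As a first taste, here is a minimal sketch (toy data, hypothetical names) of the key trick: PyTorch differentiates the loss with respect to the model <em>outputs</em> rather than any parameters:</p>

```python
import torch

# Toy targets and the current ensemble output per observation
y = torch.tensor([1.0, 2.0, 3.0])
F = torch.zeros(3, requires_grad=True)  # model output, not model parameters

# Mean squared error loss; backward() fills F.grad with dLoss/dF
loss = 0.5 * ((y - F) ** 2).mean()
loss.backward()

print(F.grad)  # one derivative per observation
```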
</section>
<section id="a-quick-recap-on-gradient-boosting" class="level2">
<h2 class="anchored" data-anchor-id="a-quick-recap-on-gradient-boosting">A quick recap on Gradient Boosting</h2>
<p>As already mentioned, in Gradient Boosting we are interested in the derivatives of the loss function with respect to the model output:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Cfrac%7B%5Cpartial%20L(%5Chat%7BF%7D(x),y)%7D%7B%5Cpartial%20%5Chat%7BF%7D(x)%7D"></p>
<p>Notice that, here, we don’t use the chain rule to extend the above to the model parameters. Rather, we stick to the above and perform gradient descent in, roughly speaking, function space. At each gradient descent step, we fit our base learner to the gradients with respect to the previous round’s model output:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7Bf%7D_%7Bm+1%7D%20=%20%5Cunderset%7Bf%7D%7B%5Ctext%7Bargmin%7D%7D%5Cfrac%7B1%7D%7BN%7D%5Csum_%7Bi=1%7D%5EN%0A%5Cleft(%5Cfrac%7B%5Cpartial%20L(%5Chat%7BF%7D_%7Bm-1%7D(x_i),y_i)%7D%7B%5Cpartial%20%5Chat%7BF%7D_%7Bm-1%7D(x_i)%7D-%20f(x_i)%5Cright)%5E2"></p>
<p>Here, <img src="https://latex.codecogs.com/png.latex?%5Chat%7BF%7D_%7Bm-1%7D"> is the ensemble of the first <img src="https://latex.codecogs.com/png.latex?m-1"> base learners, denoted as <img src="https://latex.codecogs.com/png.latex?%5Chat%7Bf%7D_j">. Notice also that the <a href="https://en.wikipedia.org/wiki/Mean_squared_error">MSE</a> criterion is not the only possible choice. For simplicity, though, we will stick to it in this post.</p>
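<p>Concretely, one such fitting step can be sketched as follows (toy data and hypothetical variable names; the tree is fit to the derivatives of the loss with respect to the current ensemble output):</p>

```python
import numpy as np
import torch
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = torch.as_tensor(X[:, 0] + 0.1 * rng.normal(size=100), dtype=torch.float32)

# Previous round's ensemble output, treated as the differentiation target
F_prev = torch.zeros(100, requires_grad=True)
loss = 0.5 * ((y - F_prev) ** 2).sum()
loss.backward()

# Fit the next base learner to the per-observation derivatives
tree = DecisionTreeRegressor(max_depth=1)
tree.fit(X, F_prev.grad.numpy())
```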
<section id="word-of-caution" class="level3">
<h3 class="anchored" data-anchor-id="word-of-caution">Word of caution</h3>
<p>As an important side-note, consider the following: If we use the squared loss, i.e.</p>
<p><img src="https://latex.codecogs.com/png.latex?L(%5Chat%7BF%7D(x),y)%20=%200.5%5Cleft(y-%5Chat%7BF%7D(x)%5Cright)%5E2,"></p>
<p>the derivative with respect to the model output is simply the negative residual:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Cfrac%7B%5Cpartial%20L(%5Chat%7BF%7D(x),y)%7D%7B%5Cpartial%20%5Chat%7BF%7D(x)%7D%20=%20-(y-%5Chat%7BF%7D(x))"></p>
<p>If you first learn about Gradient Boosting in the context of regression, you might now be tempted to think that your base learner should always estimate the residual. Given the above, however, this is obviously not the case.</p>
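<p>A quick numerical check of this derivative (a toy sketch using autograd):</p>

```python
import torch

y = torch.tensor([2.0])
F = torch.tensor([0.5], requires_grad=True)  # current model output

loss = 0.5 * (y - F) ** 2
loss.backward()

residual = y - F.detach()
print(F.grad, residual)  # compare the derivative to the residual
```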
</section>
<section id="updating-the-ensemble" class="level3">
<h3 class="anchored" data-anchor-id="updating-the-ensemble">Updating the ensemble</h3>
<p>Next, we want to update our ensemble by adding the new base learner. As already mentioned, we do this by performing what looks like a gradient descent step in function space:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7BF%7D_m(x)%20=%20%5Chat%7BF%7D_%7Bm-1%7D(x)%20-%20%5Cgamma%20%5Chat%7Bf%7D_m(x)"></p>
<p>Here, <img src="https://latex.codecogs.com/png.latex?%5Cgamma"> is the learning rate. In practice, we usually use a small value like 0.01 or 0.001.</p>
</section>
<section id="initializing-the-ensemble" class="level3">
<h3 class="anchored" data-anchor-id="initializing-the-ensemble">Initializing the ensemble</h3>
<p>The final question that remains is how to initialize the ensemble. Since our goal is to minimize the loss function, we should use a model that acts accordingly. This initial model is typically just a constant, namely the value that minimizes the loss over all training targets:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7Bf%7D_0(x)%20=%20%5Cunderset%7Bc%7D%7B%5Ctext%7Bargmin%7D%7D%20%5Cfrac%7B1%7D%7BN%7D%5Csum_%7Bi=1%7D%5EN%20L(c,y_i)"></p>
<p>For the MSE loss, <a href="https://math.stackexchange.com/questions/2554243/understanding-the-mean-minimizes-the-mean-squared-error">this is simply the mean of the target values</a>. Keep in mind, again, that this is usually not the case for other loss functions.</p>
<p>Also, you will notice that in the examples below, I won’t always use the loss-minimizing constant(s) to start the algorithm. While this might yield slightly worse results, it leads to less verbose code. If you want to replicate the results, you might therefore want to use the actual loss-minimizing constant(s).</p>
</section>
<section id="the-algorithm-in-summary" class="level3">
<h3 class="anchored" data-anchor-id="the-algorithm-in-summary">The algorithm in summary</h3>
<p>To summarize, the algorithm for Gradient Boosting is as follows:</p>
<ol type="1">
<li>Initialize the ensemble with a constant value</li>
<li>For each round <img src="https://latex.codecogs.com/png.latex?m">:
<ol type="1">
<li>Calculate the gradients of the loss with respect to the previous round’s model output</li>
<li>Fit a base learner to these gradients</li>
<li>Update the ensemble by performing a gradient descent step in function space</li>
</ol></li>
</ol>
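<p>The whole loop then fits in a few lines. The following is a minimal MSE-only sketch with hypothetical helper names; it uses autograd for the derivatives even though they are available in closed form here:</p>

```python
import numpy as np
import torch
from sklearn.tree import DecisionTreeRegressor

def boost(X, y, n_rounds=100, lr=0.1, max_depth=1):
    y_t = torch.as_tensor(y, dtype=torch.float32)
    f0 = float(np.mean(y))                      # 1. constant initialization
    F = torch.full((len(y),), f0)
    trees = []
    for _ in range(n_rounds):                   # 2. boosting rounds
        F_var = F.detach().clone().requires_grad_(True)
        loss = 0.5 * ((y_t - F_var) ** 2).sum()
        loss.backward()                         # 2.1 derivatives w.r.t. output
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, F_var.grad.numpy())         # 2.2 fit base learner
        step = torch.as_tensor(tree.predict(X), dtype=torch.float32)
        F = F - lr * step                       # 2.3 descent step in function space
        trees.append(tree)
    return f0, trees

def predict(f0, trees, X, lr=0.1):
    F = np.full(X.shape[0], f0)
    for tree in trees:
        F -= lr * tree.predict(X)
    return F

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)
f0, trees = boost(X, y)
```

<p>Replacing the squared loss with any other differentiable criterion only changes the <code>loss</code> line; autograd takes care of the rest.</p>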
</section>
</section>
<section id="gradient-boosting-for-conditional-probability-distributions" class="level2">
<h2 class="anchored" data-anchor-id="gradient-boosting-for-conditional-probability-distributions">Gradient Boosting for conditional probability distributions</h2>
<p>If you come from a Statistics background, you might be saddened by the lack of libraries for probabilistic Gradient Boosting. Despite the countless probability distributions that could well suit your data, you are usually limited to Bernoulli/Multinomial distributions via cross-entropy losses.</p>
<p>Thus, without further ado, let’s see how PyTorch can solve this problem for us. While we will implement the algorithm for a Gaussian distribution, it should be fairly easy to extend it to other distributions. The most characteristic feature here is that we need a Boosting model for each of the distribution’s parameters. In the Gaussian case, this means two models for the mean and standard deviation, respectively. Let us exemplify this by writing our full model as a stacked vector of two models:</p>
<p><img src="https://latex.codecogs.com/png.latex?F(x)=%5Cbegin%7Bbmatrix%7DF%5E%7B(1)%7D(x)%20%5C%5C%20F%5E%7B(2)%7D(x)%5Cend%7Bbmatrix%7D"></p>
<p>In addition, we also need to ensure that the parameters are valid, i.e.&nbsp;positive for the standard deviation. Thus, we also have to map the output of the corresponding Gradient Boosting model to the positive, non-zero reals. This can be done, for example, via the <img src="https://latex.codecogs.com/png.latex?%5Cexp">-function.</p>
<p>For a single input <img src="https://latex.codecogs.com/png.latex?x">, the full probabilistic description is now the following:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(y%7Cx)=%5Cmathcal%7BN%7D(y%7CF%5E%7B(1)%7D(x),%20%5Cexp%7B(F%5E%7B(2)%7D(x))%7D)"></p>
<p>As usual for probabilistic models, we will use the <a href="https://en.wikipedia.org/wiki/Likelihood_function#Log-likelihood">negative log-likelihood</a> as our loss function:</p>
<p><img src="https://latex.codecogs.com/png.latex?L(F(x),y)=-%5Clog%20p(y%7Cx)%20=%20-%20%5Clog%5Cmathcal%7BN%7D(y%7CF%5E%7B(1)%7D(x),%20%5Cexp%7B(F%5E%7B(2)%7D(x))%7D)"></p>
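<p>With <code>torch.distributions</code>, this loss requires no manual derivation. A minimal sketch (toy values; <code>F1</code> and <code>F2</code> are hypothetical stand-ins for the two model outputs):</p>

```python
import torch
from torch.distributions.normal import Normal

y = torch.tensor([0.5, 1.5, -0.2])
F1 = torch.zeros(3, requires_grad=True)  # mean outputs
F2 = torch.zeros(3, requires_grad=True)  # log standard deviation outputs

# Negative log-likelihood of the Gaussian with exp-transformed scale
nll = -Normal(F1, torch.exp(F2)).log_prob(y).sum()
nll.backward()

print(F1.grad, F2.grad)  # derivatives for both parameter models
```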
<p>To initialize the model, we can use the <a href="https://www.statlect.com/fundamentals-of-statistics/normal-distribution-maximum-likelihood#:~:text=Proof-,The%20maximum%20likelihood%20estimators,-The%20maximum%20likelihood">maximum likelihood estimates</a> for the mean and standard deviation:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7BF%7D%5E%7B(1)%7D_0(x)%20=%20%5Cfrac%7B1%7D%7BN%7D%5Csum_%7Bi=1%7D%5EN%20y_i"></p>
<p>and</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7BF%7D%5E%7B(2)%7D_0(x)%20=%20%5Csqrt%7B%5Cfrac%7B1%7D%7BN%7D%5Csum_%7Bi=1%7D%5EN%20(y_i-%5Chat%7BF%7D%5E%7B(1)%7D(x))%5E2%7D."></p>
<p>Finally, we calculate the logarithm of <img src="https://latex.codecogs.com/png.latex?%5Chat%7BF%7D%5E%7B(2)%7D_0(x)"> to account for the exponentiation in the loss function. Keep in mind that this is not necessarily the optimal initialization. However, it is a good starting point for our experiments.</p>
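<p>In code, this initialization is a one-liner each (a sketch with toy targets):</p>

```python
import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])

init_mu = np.mean(y)                # MLE of the mean
init_log_sigma = np.log(np.std(y))  # log of the MLE standard deviation

print(init_mu, init_log_sigma)
```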
<p>We can now code our model as follows:</p>
<div id="1b916bc3-3a76-4ce5-8d05-85c9de551758" class="cell" data-execution_count="35">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.tree <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> DecisionTreeRegressor</span>
<span id="cb1-6"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.linear_model <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> LinearRegression</span>
<span id="cb1-7"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.datasets <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> fetch_california_housing</span>
<span id="cb1-8"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.model_selection <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> train_test_split</span>
<span id="cb1-9"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.ensemble <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> GradientBoostingRegressor</span>
<span id="cb1-10"></span>
<span id="cb1-11"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> torch</span>
<span id="cb1-12"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> torch.distributions.normal <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Normal</span>
<span id="cb1-13"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> torch.autograd <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Variable</span>
<span id="cb1-14"></span>
<span id="cb1-15"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> List, Optional</span>
<span id="cb1-16"></span>
<span id="cb1-17"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> GaussianGradientBoosting:</span>
<span id="cb1-18"></span>
<span id="cb1-19">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>,</span>
<span id="cb1-20">                 learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>,</span>
<span id="cb1-21">                 max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb1-22">                 n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>):</span>
<span id="cb1-23"></span>
<span id="cb1-24">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> learning_rate</span>
<span id="cb1-25">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> max_depth</span>
<span id="cb1-26">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_estimators</span>
<span id="cb1-27"></span>
<span id="cb1-28">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_mu: Optional[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span></span>
<span id="cb1-29">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mu_trees: List[DecisionTreeRegressor] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-30">        </span>
<span id="cb1-31">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_sigma: Optional[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span></span>
<span id="cb1-32">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma_trees: List[DecisionTreeRegressor] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-33">        </span>
<span id="cb1-34">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb1-35"></span>
<span id="cb1-36">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-37">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb1-38"></span>
<span id="cb1-39">        mus <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_mus(X).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-40">        sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.exp(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_log_sigmas(X).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb1-41"></span>
<span id="cb1-42">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> np.concatenate([mus, sigmas], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb1-43"></span>
<span id="cb1-44">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_raw(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-45">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb1-46"></span>
<span id="cb1-47">        mus <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_mus(X).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-48">        log_sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_log_sigmas(X).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-49"></span>
<span id="cb1-50">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> np.concatenate([mus, log_sigmas], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb1-51">    </span>
<span id="cb1-52"></span>
<span id="cb1-53">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> fit(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-54">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._fit_initial(y)</span>
<span id="cb1-55"></span>
<span id="cb1-56">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb1-57"></span>
<span id="cb1-58">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators):</span>
<span id="cb1-59">            y_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_raw(X)</span>
<span id="cb1-60"></span>
<span id="cb1-61">            gradients <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._get_gradients(y, y_pred)</span>
<span id="cb1-62"></span>
<span id="cb1-63">            mu_tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth)</span>
<span id="cb1-64">            mu_tree.fit(X, gradients[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>])</span>
<span id="cb1-65">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mu_trees.append(mu_tree)</span>
<span id="cb1-66"></span>
<span id="cb1-67">            sigma_tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth)</span>
<span id="cb1-68">            sigma_tree.fit(X, gradients[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb1-69">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma_trees.append(sigma_tree)</span>
<span id="cb1-70"></span>
<span id="cb1-71"></span>
<span id="cb1-72">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _fit_initial(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-73">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb1-74">        </span>
<span id="cb1-75">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_mu <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(y)</span>
<span id="cb1-76">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(np.std(y))</span>
<span id="cb1-77"></span>
<span id="cb1-78"></span>
<span id="cb1-79">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _get_gradients(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y: np.array, y_pred: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-80">        y_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb1-81">        y_pred_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Variable(torch.tensor(y_pred).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(), requires_grad<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb1-82"></span>
<span id="cb1-83">        normal_dist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Normal(y_pred_torch[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], torch.exp(y_pred_torch[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])).log_prob(y_torch).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>()</span>
<span id="cb1-84">        normal_dist.backward()</span>
<span id="cb1-85"></span>
<span id="cb1-86">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> y_pred_torch.grad.numpy()</span>
<span id="cb1-87"></span>
<span id="cb1-88"></span>
<span id="cb1-89">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_mus(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-90">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X))</span>
<span id="cb1-91"></span>
<span id="cb1-92">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_mu</span>
<span id="cb1-93"></span>
<span id="cb1-94">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mu_trees:</span>
<span id="cb1-95">            output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> tree.predict(X)</span>
<span id="cb1-96"></span>
<span id="cb1-97">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> output</span>
<span id="cb1-98"></span>
<span id="cb1-99">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_log_sigmas(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-100">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X))</span>
<span id="cb1-101"></span>
<span id="cb1-102">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_sigma</span>
<span id="cb1-103"></span>
<span id="cb1-104">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma_trees:</span>
<span id="cb1-105">            output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> tree.predict(X)</span>
<span id="cb1-106"></span>
<span id="cb1-107">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> output</span>
<span id="cb1-108">        </span>
<span id="cb1-109">        </span>
<span id="cb1-110">        </span></code></pre></div>
</div>
<p>Let’s take a look at some important code fragments. In particular, the steps in <code>fit</code> are interesting to us:</p>
<p>At first, <code>self._fit_initial(y)</code> initializes the model via the mean and log-standard deviation as described above. Next, we iterate over the number of boosting rounds, defined by the <code>n_estimators</code> attribute. The <code>_predict_raw</code> method calculates the model’s outputs at iteration <img src="https://latex.codecogs.com/png.latex?m-1">, and we pass the results to <code>_get_gradients</code>. There, we use PyTorch to differentiate the log-likelihood loss with respect to each output. Lucky for us, PyTorch already provides the required log-likelihood function via <code>log_prob</code>:</p>
<div class="sourceCode" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1">y_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb2-2">y_pred_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Variable(torch.tensor(y_pred).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(), requires_grad<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb2-3"></span>
<span id="cb2-4">normal_dist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Normal(y_pred_torch[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], torch.exp(y_pred_torch[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])).log_prob(y_torch).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>()</span>
<span id="cb2-5">normal_dist.backward()</span>
<span id="cb2-6"></span>
<span id="cb2-7"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> y_pred_torch.grad.numpy()</span></code></pre></div>
<p>Notice that PyTorch requires the final loss to be a single scalar value. Hence, we need to sum the log-likelihoods over all inputs. It is easy to check that the respective per-sample derivatives remain as expected:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Cbegin%7Bgathered%7D%0A%5Cfrac%7B%5Cpartial%20%5Csum_%7Bi=1%7D%5EN%20L(F(x_i),y_i)%7D%7B%5Cpartial%20F%5E%7B(m)%7D(x_t)%7D%20%5C%5C%0A=%5Csum_%7Bi=1%7D%5EN%20%5Cfrac%7B%5Cpartial%20L(F(x_i),y_i)%7D%7B%5Cpartial%20F%5E%7B(m)%7D(x_t)%7D%20%5C%5C%0A=%200%20+%200%20+%20%5Ccdots%20+%20%5Cfrac%7B%5Cpartial%20L(F(x_t),y_t)%7D%7B%5Cpartial%20F%5E%7B(m)%7D(x_t)%7D%20+%200%20+%20%5Ccdots%20+%200%0A%5Cend%7Bgathered%7D%0A"></p>
<p>where <img src="https://latex.codecogs.com/png.latex?t"> denotes the index of the input <img src="https://latex.codecogs.com/png.latex?x_t"> for which we want to calculate the derivative.</p>
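<p>This decoupling is easy to verify numerically. The following standalone sketch (NumPy only; all names are my own and not part of the model above) uses central differences to check that the gradient of the <em>summed</em> Gaussian log-likelihood with respect to one output equals that sample’s individual gradient:</p>

```python
import numpy as np

def loglik(mu, y, sigma=1.0):
    # element-wise Gaussian log-likelihood with fixed sigma
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * (y - mu)**2 / sigma**2

y = np.array([0.5, -1.0, 2.0])
mu = np.zeros(3)
eps = 1e-6

# central-difference gradient of the SUMMED log-likelihood w.r.t. mu[1] only
mu_plus, mu_minus = mu.copy(), mu.copy()
mu_plus[1] += eps
mu_minus[1] -= eps
grad_sum = (loglik(mu_plus, y).sum() - loglik(mu_minus, y).sum()) / (2 * eps)

# per-sample analytic gradient: (y - mu) / sigma^2 -- all other summands vanish
assert np.isclose(grad_sum, y[1] - mu[1], atol=1e-4)
```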
<p>In a final step, we fit two <code>sklearn.tree.DecisionTreeRegressor</code> instances to the negative gradients of the loss. Since <code>log_prob</code> returns the log-likelihood itself rather than the negative log-likelihood loss, its gradients already carry the correct sign, so no sign flip is needed.</p>
<p>Now, let us test our model on a simple toy problem. We define the data generating process as follows:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap(x)%20=%20%5Cmathcal%7BU%7D(x%7C-3,3)%20%5C%5C%0Ap(y%7Cx)%20=%20%5Cmathcal%7BN%7D(y%7Csin(x),%200.05%20+%200.1%20*%20x%5E2)%0A%5Cend%7Bgathered%7D%0A"></p>
<p><img src="https://latex.codecogs.com/png.latex?x"> is uniformly distributed between -3 and 3, while <img src="https://latex.codecogs.com/png.latex?y"> is normally distributed with a mean of <img src="https://latex.codecogs.com/png.latex?sin(x)"> and a standard deviation of <img src="https://latex.codecogs.com/png.latex?0.05%20+%200.1%20*%20x%5E2">. This makes the data heteroscedastic, i.e.&nbsp;the variance depends on the input <img src="https://latex.codecogs.com/png.latex?x">. A useful model should be able to capture both the conditional mean and the conditional standard deviation.</p>
<div id="8c0dc39f" class="cell" data-execution_count="40">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb3-2"></span>
<span id="cb3-3">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.uniform(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb3-4">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(X).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> np.random.normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb3-5"></span>
<span id="cb3-6">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb3-7">plt.scatter(X, y, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<div id="780a71d4" class="cell" data-execution_count="41">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> GaussianGradientBoosting(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>)</span>
<span id="cb4-2"></span>
<span id="cb4-3">model.fit(X, y)</span>
<span id="cb4-4"></span>
<span id="cb4-5">line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-6"></span>
<span id="cb4-7">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.predict(line)</span>
<span id="cb4-8">mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> predictions[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb4-9">std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> predictions[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb4-10"></span>
<span id="cb4-11">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb4-12">plt.plot(line, np.sin(line), c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'r'</span>)</span>
<span id="cb4-13">plt.plot(line, mean, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'b'</span>)</span>
<span id="cb4-14">plt.fill_between(line.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>std, mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>std, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'b'</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>)</span>
<span id="cb4-15">plt.scatter(X, y, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This looks pretty good! We can see that the model is able to capture both the conditional mean and standard deviation. In particular, the model is able to capture the increasing variance for larger absolute <img src="https://latex.codecogs.com/png.latex?x">.</p>
<p>From here, it would be relatively easy to extend the model to other distributions. We could, for example, experiment with Gamma or Log-Normal distributions to handle target values that can only be positive. More advanced models could also incorporate <a href="https://en.wikipedia.org/wiki/Censoring_(statistics)">censoring</a> or <a href="https://en.wikipedia.org/wiki/Truncated_distribution">truncation</a>. As long as the respective log-likelihood loss is differentiable with respect to the model output, we can optimize our model.</p>
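<p>To make this concrete, here is a sketch of what a Gamma variant of the gradient step could look like. The function name and the parameterization (log-concentration and log-rate, exponentiated so both parameters stay positive) are my own assumptions, not part of the model above:</p>

```python
import numpy as np
import torch
from torch.distributions import Gamma

def gamma_gradients(y: np.array, y_pred: np.array) -> np.array:
    # Hypothetical drop-in for _get_gradients: column 0 holds the raw
    # log-concentration output, column 1 the raw log-rate output.
    y_torch = torch.tensor(y).float()
    y_pred_torch = torch.tensor(y_pred, dtype=torch.float32, requires_grad=True)

    # exponentiation maps the unconstrained boosting outputs to valid parameters
    loglik = Gamma(torch.exp(y_pred_torch[:, 0]),
                   torch.exp(y_pred_torch[:, 1])).log_prob(y_torch).sum()
    loglik.backward()

    return y_pred_torch.grad.numpy()
```

<p>Note that wrapping the tensor in <code>Variable</code> is not required; passing <code>requires_grad=True</code> directly to <code>torch.tensor</code> is sufficient in modern PyTorch.</p>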
</section>
<section id="varying-coefficient-boosting" class="level2">
<h2 class="anchored" data-anchor-id="varying-coefficient-boosting">Varying Coefficient Boosting</h2>
<p>A fairly common problem with sophisticated machine learning models is that they are black boxes. While their predictive power can be much better than that of simpler models, their results are often hard to interpret. Hence, a model could make biased, unfair or physically wrong predictions without us even noticing. As a result, there is growing interest in models that are more interpretable.</p>
<p>One such model is <a href="https://en.wikipedia.org/wiki/Varying-coefficient_model">Varying Coefficient Regression</a>. Here, we assume that the coefficients of a linear regression model are not constant but rather depend on the input. In the context of Gradient Boosting, this could look as follows:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7By%7D%20=%20F%5E%7B(0)%7D(x)%20+%20F%5E%7B(1)%7D(x)%5Ccdot%20x%5E%7B(1)%7D%20+%20%5Ccdots%20+%20F%5E%7B(M)%7D(x)%5Ccdot%20x%5E%7B(M)%7D"></p>
<p>Here, we have <img src="https://latex.codecogs.com/png.latex?M+1"> Boosting models, one for each coefficient in a linear regression model. In a sufficiently small neighborhood around <img src="https://latex.codecogs.com/png.latex?x">, the model can then be interpreted as a linear regression model with coefficients defined by the boosting model. Once we move outside of this neighborhood, the coefficients change accordingly.</p>
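<p>The local-linear reading can be sketched in a few lines; the two coefficient functions below are hypothetical stand-ins for the boosted models, chosen only for illustration:</p>

```python
import numpy as np

# Toy stand-ins for the coefficient functions F^(0) and F^(1);
# in the actual model, each would be a sum of regression trees.
def F0(x):
    return 0.5 * np.ones_like(x)   # input-dependent intercept (constant here)

def F1(x):
    return 1.0 + 0.2 * x           # slope on feature x^(1), varying with x

x = np.array([-1.0, 0.0, 2.0])
y_hat = F0(x) + F1(x) * x          # locally: intercept F0(x) plus slope F1(x)

# near each x, predictions behave like a plain linear model with these coefficients
assert np.allclose(y_hat, [-0.3, 0.5, 3.3])
```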
<p>Applying the squared error criterion to our Varying Coefficient model above, we now get</p>
<p><img src="https://latex.codecogs.com/png.latex?L(F(x),y)=0.5%20%5Cleft(y%20-%20%5Cleft(F%5E%7B(0)%7D(x)%20+%20F%5E%7B(1)%7D(x)%5Ccdot%20x%5E%7B(1)%7D%20+%20%5Ccdots%20+%20F%5E%7B(M)%7D(x)%5Ccdot%20x%5E%7B(M)%7D%5Cright)%5Cright)%5E2"></p>
<p>Again, we can let PyTorch differentiate the above with respect to each Gradient Boosting output. To initialize the model, we could, for example, start with the (constant) OLS regression coefficients. This would be the “correct” approach, given that we typically want to start with the constant minimizer of the mean loss function over all training points. Here, however, I decided to set <img src="https://latex.codecogs.com/png.latex?F_0%5E%7B(0)%7D(x)=%5Cfrac%7B1%7D%7BN%7D%5Csum_%7Bi=1%7D%5EN%20y_i"> and the remaining coefficients to zero.</p>
<p>Putting all of this together, we get the following model in Python:</p>
<div id="acc5f75e" class="cell" data-execution_count="42">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> VaryingCoefficientGradientBoosting:</span>
<span id="cb5-2"></span>
<span id="cb5-3">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>,</span>
<span id="cb5-4">                 learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>,</span>
<span id="cb5-5">                 max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb5-6">                 n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>):</span>
<span id="cb5-7"></span>
<span id="cb5-8">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> learning_rate</span>
<span id="cb5-9">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> max_depth</span>
<span id="cb5-10">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_estimators</span>
<span id="cb5-11"></span>
<span id="cb5-12">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs: Optional[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">np.ndarray</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span></span>
<span id="cb5-13">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.coeff_trees: List[List[DecisionTreeRegressor]] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb5-14">        </span>
<span id="cb5-15">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb5-16"></span>
<span id="cb5-17">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@property</span></span>
<span id="cb5-18">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> n_coefficients(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>:</span>
<span id="cb5-19">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained:</span>
<span id="cb5-20">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb5-21">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb5-22">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span> </span>
<span id="cb5-23"></span>
<span id="cb5-24">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb5-25">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb5-26"></span>
<span id="cb5-27">        X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb5-28"></span>
<span id="cb5-29">        coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_coeffs(X)</span>
<span id="cb5-30">        predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> coeffs, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb5-31"></span>
<span id="cb5-32">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> predictions</span>
<span id="cb5-33"></span>
<span id="cb5-34">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_raw(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb5-35">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb5-36">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_coeffs(X)</span>
<span id="cb5-37">    </span>
<span id="cb5-38"></span>
<span id="cb5-39">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> fit(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb5-40">        X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb5-41">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._fit_initial(X, y)</span>
<span id="cb5-42"></span>
<span id="cb5-43">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb5-44"></span>
<span id="cb5-45">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators):</span>
<span id="cb5-46">            y_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_raw(X)</span>
<span id="cb5-47"></span>
<span id="cb5-48">            gradients <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._get_gradients(X, y, y_pred)</span>
<span id="cb5-49"></span>
<span id="cb5-50">            new_trees <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb5-51"></span>
<span id="cb5-52">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_coefficients):</span>
<span id="cb5-53">                coeff_tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth)</span>
<span id="cb5-54">                coeff_tree.fit(X, gradients[:,c])</span>
<span id="cb5-55"></span>
<span id="cb5-56">                new_trees.append(coeff_tree)</span>
<span id="cb5-57"></span>
<span id="cb5-58">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.coeff_trees.append(new_trees)</span>
<span id="cb5-59"></span>
<span id="cb5-60"></span>
<span id="cb5-61"></span>
<span id="cb5-62">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _fit_initial(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb5-63">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb5-64"></span>
<span id="cb5-65">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, X.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]))</span>
<span id="cb5-66">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(y)</span>
<span id="cb5-67"></span>
<span id="cb5-68"></span>
<span id="cb5-69">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _get_gradients(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array, y_pred: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb5-70">        X_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb5-71">        y_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb5-72">        y_pred_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Variable(torch.tensor(y_pred).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(), requires_grad<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb5-73"></span>
<span id="cb5-74">        sse <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (y_torch<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>(X_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> y_pred_torch).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">pow</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span>).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>() <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#negative sse to get negative gradient</span></span>
<span id="cb5-75">        sse.backward() </span>
<span id="cb5-76">        grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_pred_torch.grad.numpy()</span>
<span id="cb5-77"></span>
<span id="cb5-78">        grads[grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)</span>
<span id="cb5-79">        grads[grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span>
<span id="cb5-80"></span>
<span id="cb5-81">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> grads</span>
<span id="cb5-82"></span>
<span id="cb5-83"></span>
<span id="cb5-84">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_coeffs(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb5-85">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_coefficients))</span>
<span id="cb5-86"></span>
<span id="cb5-87">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs</span>
<span id="cb5-88"></span>
<span id="cb5-89">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree_list <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.coeff_trees:</span>
<span id="cb5-90">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_coefficients):</span>
<span id="cb5-91">                output[:,c] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> tree_list[c].predict(X)</span>
<span id="cb5-92"></span>
<span id="cb5-93">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> output</span></code></pre></div>
</div>
<p>This closely resembles the code for the probabilistic model from before. A key difference, however, is the following:</p>
<div class="sourceCode" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1">X_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb6-2">y_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb6-3">y_pred_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Variable(torch.tensor(y_pred).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(), requires_grad<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb6-4"></span>
<span id="cb6-5">sse <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (y_torch<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>(X_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> y_pred_torch).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">pow</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span>).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>()</span>
<span id="cb6-6">sse.backward() </span>
<span id="cb6-7">grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_pred_torch.grad.numpy()</span>
<span id="cb6-8"></span>
<span id="cb6-9">grads[grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)</span>
<span id="cb6-10">grads[grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span></code></pre></div>
<p>Here, we clip the gradients at their 5% and 95% quantiles. The reason is purely empirical: with the raw gradients, training turned out to be fairly unstable, and the model loss would usually diverge to infinity. It would be interesting to work out the theoretical reason for this, but for now we will stick to the applied side.</p>
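<p>For reference, the two quantile assignments above are equivalent to a single <code>np.clip</code> call. A minimal standalone sketch (the gradient array here is made-up data, not output from the post):</p>

```python
import numpy as np

# hypothetical stand-in for the raw per-coefficient gradients
rng = np.random.default_rng(0)
grads = rng.normal(size=1000)

# winsorize at the 5% and 95% quantiles, as done in _get_gradients
lo, hi = np.quantile(grads, 0.05), np.quantile(grads, 0.95)
clipped = np.clip(grads, lo, hi)

assert clipped.min() == lo and clipped.max() == hi
```

<p>This keeps the gradient signal for the bulk of the observations while taming the extreme tails that caused the divergence described above.</p>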
<p>Notice also that we introduce the intercept term by prepending a column of ones to the input matrix:</p>
<div class="sourceCode" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> fit(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb7-2">    X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-3">    ...</span></code></pre></div>
<div class="sourceCode" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb8-2">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb8-3"></span>
<span id="cb8-4">    X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-5">    ...</span></code></pre></div>
<p>This lets us treat the Boosting model for the intercept just like the ones for the feature coefficients. As a result, some calculations become a bit simpler.</p>
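<p>To see why this simplifies things, consider a tiny made-up example: with the prepended ones column, the intercept is just another coefficient, and the prediction collapses into a single elementwise product and row sum, exactly as in <code>predict</code>:</p>

```python
import numpy as np

X = np.array([[1.5], [2.0]])  # two observations, one feature

# prepend the intercept column of ones, as in fit/predict
X_aug = np.concatenate([np.ones(shape=(len(X), 1)), X], 1)

# per-row coefficients: [intercept, slope] (hypothetical values)
coeffs = np.array([[0.5, 2.0],
                   [0.5, 2.0]])

# intercept * 1 + slope * x, in one product-and-sum over columns
preds = np.sum(X_aug * coeffs, 1)
assert np.allclose(preds, [3.5, 4.5])
```

<p>Without the ones column, the intercept would need a separate code path in every method that touches the coefficients.</p>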
<p>Now, let us test the model on a simple toy problem. We define the data generating process as before, but keep the variance constant:</p>
<div id="147c53f7" class="cell" data-execution_count="62">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb9-2"></span>
<span id="cb9-3">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.uniform(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb9-4">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(X).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> np.random.normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>) </span>
<span id="cb9-5">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb9-6">plt.scatter(X, y, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<div id="4035ab05" class="cell" data-execution_count="63">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientGradientBoosting(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>, learning_rate<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>)</span>
<span id="cb10-2"></span>
<span id="cb10-3">model.fit(X, y)</span>
<span id="cb10-4"></span>
<span id="cb10-5">line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb10-6"></span>
<span id="cb10-7">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.predict(line)</span>
<span id="cb10-8"></span>
<span id="cb10-9">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb10-10">plt.plot(line, np.sin(line), c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'r'</span>)</span>
<span id="cb10-11">plt.plot(line, predictions, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'b'</span>)</span>
<span id="cb10-12">plt.scatter(X, y, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-7-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Again, the results look reasonable. As our new model has a neat interpretability feature, let us also run the model on an actual dataset. Here, I chose the California Housing dataset from <a href="https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html">scikit-learn</a>. The dataset contains information about housing prices in California in the 1990s. The goal is to predict the median house value in a given block based on the remaining features.</p>
<p>To make things more interesting, we will also compare the results against standard Gradient Boosting from sklearn. Keep in mind, though, that the evaluation process is really simple and should not be considered a proper benchmark.</p>
<div id="d3f39446" class="cell" data-execution_count="64">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1">housing <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fetch_california_housing()</span>
<span id="cb11-2">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> housing.data</span>
<span id="cb11-3"></span>
<span id="cb11-4">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> housing.target</span>
<span id="cb11-5">X_train, X_test, y_train, y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_test_split(X, y, test_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.33</span>, random_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb11-6"></span>
<span id="cb11-7">X_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(X_train, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb11-8">X_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(X_train, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb11-9"></span>
<span id="cb11-10">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_std</span>
<span id="cb11-11">X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_std</span>
<span id="cb11-12"></span>
<span id="cb11-13">y_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(y_train)</span>
<span id="cb11-14">y_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(y_train)</span>
<span id="cb11-15"></span>
<span id="cb11-16">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> y_std</span>
<span id="cb11-17">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> y_std</span>
<span id="cb11-18"></span></code></pre></div>
</div>
<div id="0a1b6684" class="cell" data-execution_count="65">
<div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb12-2"></span>
<span id="cb12-3">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientGradientBoosting(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb12-4">model.fit(X_train, y_train)</span>
<span id="cb12-5"></span>
<span id="cb12-6">predictions_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.predict(X_test)</span>
<span id="cb12-7"></span>
<span id="cb12-8">rmse_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb12-9"></span>
<span id="cb12-10"></span>
<span id="cb12-11">gb_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> GradientBoostingRegressor(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb12-12">gb_model.fit(X_train, y_train)</span>
<span id="cb12-13"></span>
<span id="cb12-14">predictions_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gb_model.predict(X_test)</span>
<span id="cb12-15"></span>
<span id="cb12-16">rmse_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb12-17"></span>
<span id="cb12-18"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for varying coefficient model: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_varcoeff<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb12-19"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for gradient boosting model: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_gb<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>RMSE for varying coefficient model: 0.4724723297379694
RMSE for gradient boosting model: 0.4893399365654721</code></pre>
</div>
</div>
<p>This looks great. For the given hyperparameters and data, our approach is able to keep up with standard Gradient Boosting. Now, let us take a look at the coefficients of our model for a given input:</p>
<div id="71a44114" class="cell" data-execution_count="66">
<div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb14-1">X_eval <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X_test[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)], axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb14-2"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#need to prepend the intercept column manually, since _predict_coeffs() expects it</span></span>
<span id="cb14-3"></span>
<span id="cb14-4">importances <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.Series(model._predict_coeffs(X_eval).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), index<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Intercept"</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>housing.feature_names)</span>
<span id="cb14-5">importances.sort_values().plot(kind<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'bar'</span>, figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-10-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Besides the intercept, House Age and Median Income appear to be the most important features for the given input. For another input observation, we get the following:</p>
<div id="29deed6f" class="cell" data-execution_count="67">
<div class="sourceCode cell-code" id="cb15" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb15-1">X_eval <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X_test[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)], axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb15-2"></span>
<span id="cb15-3">importances <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.Series(model._predict_coeffs(X_eval).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), index<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Intercept"</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>housing.feature_names)</span>
<span id="cb15-4">importances.sort_values().plot(kind<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'bar'</span>, figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-11-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Here, the intercept term has much more relevance. Also, latitude and longitude are now the most influential features.</p>
<p>It might be debatable whether the intercept term is actually sensible. Technically, removing all other coefficients and only keeping the intercept would result in standard Gradient Boosting. As the latter is already sufficiently powerful in itself, why use varying coefficients at all? On the other hand, removing the intercept entirely would result in a degenerate model for an all-zero input vector.</p>
<p>The only possible prediction without an intercept term would then be zero as well, which is not what we want either. Another solution would be to model the intercept term as a constant, rather than via a Gradient Boosting model. As this would make the code more verbose, I decided not to do it here.</p>
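<p>A minimal sketch of that constant-intercept alternative (hypothetical names, not code from this post): under squared error, the optimal constant is simply the mean of the training targets, fitted once rather than boosted, while the varying coefficients would model deviations from it.</p>

```python
import numpy as np

# Sketch: fit the intercept once as a constant instead of via boosting.
# Under squared error, the optimal constant is the mean of the targets.
class ConstantIntercept:
    def fit(self, y):
        self.value = float(np.mean(y))  # least-squares optimal constant
        return self

    def predict(self, n_samples):
        # the same constant for every observation
        return np.full(n_samples, self.value)

y_demo = np.array([1.0, 2.0, 3.0, 6.0])
intercept = ConstantIntercept().fit(y_demo)
print(intercept.predict(2))  # [3. 3.]
```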
</section>
<section id="combining-deep-learning-with-boosted-trees" class="level2">
<h2 class="anchored" data-anchor-id="combining-deep-learning-with-boosted-trees">Combining Deep Learning with Boosted Trees</h2>
<p>If you are confident in your Calculus skills, our usage of PyTorch so far might seem rather unnecessary. After all, we could have just calculated the derivatives by hand, without requiring a full-blown framework like PyTorch. While this would mean more manual work, it would also be computationally more efficient. For that reason, we will now look at an example where manual differentiation is not a viable option if you want to maintain a healthy social life.</p>
<p>The idea is fairly simple: use a Convolutional Neural Network, but replace the elements of the last weight matrix with Gradient Boosting varying coefficients. As this could easily blow up model complexity (consider one Boosting model for each element in a <img src="https://latex.codecogs.com/png.latex?10%5Ctimes10"> matrix), we only work with a rather simple example. In fact, we’ll reduce the MNIST dataset to only 0s and 1s to get a binary classification problem. Also, we give the penultimate layer only two neurons, which reduces the final weight matrix to size <img src="https://latex.codecogs.com/png.latex?2%5Ctimes1">.</p>
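<p>Before looking at the full class, here is a tiny numeric sketch (with made-up numbers) of what this final layer computes: each sample receives its own 2-dimensional coefficient vector from the boosted trees, which is dotted with its 2 penultimate activations to produce a logit, followed by a sigmoid.</p>

```python
import numpy as np

# Made-up penultimate activations (one row per sample, 2 neurons each)
x = np.array([[1.0, 2.0],
              [0.5, -1.0]])
# Made-up per-sample varying coefficients, as the boosted trees would predict
coeffs = np.array([[0.3, 0.1],
                   [1.0, 0.0]])

logits = np.sum(x * coeffs, axis=1)    # row-wise dot product, shape (2,)
probs = 1.0 / (1.0 + np.exp(-logits))  # sigmoid -> probability of class 1
print(logits)  # [0.5 0.5]
```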
<p>The rest of the model is just a standard CNN with some arbitrary hyperparameters:</p>
<div id="38211e06" class="cell" data-execution_count="68">
<div class="sourceCode cell-code" id="cb16" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb16-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> ConvolutionalNetWithBoostedTrees:</span>
<span id="cb16-2"></span>
<span id="cb16-3">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>,</span>
<span id="cb16-4">                 learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>,</span>
<span id="cb16-5">                 max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb16-6">                 n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, </span>
<span id="cb16-7">                 n_booster: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>):</span>
<span id="cb16-8">        </span>
<span id="cb16-9">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.Conv2d(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>, kernel_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, stride<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, padding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb16-10">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.Conv2d(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, kernel_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, stride<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, padding<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb16-11">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb16-12">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.Linear(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, n_booster)</span>
<span id="cb16-13"></span>
<span id="cb16-14">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> learning_rate</span>
<span id="cb16-15">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> max_depth</span>
<span id="cb16-16">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_estimators</span>
<span id="cb16-17"></span>
<span id="cb16-18">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.boosted_trees: List[List[DecisionTreeRegressor]] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb16-19"></span>
<span id="cb16-20">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs: np.array <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, n_booster))</span>
<span id="cb16-21"></span>
<span id="cb16-22">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb16-23">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_booster <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_booster</span>
<span id="cb16-24"></span>
<span id="cb16-25"></span>
<span id="cb16-26">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array):</span>
<span id="cb16-27">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb16-28"></span>
<span id="cb16-29">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1(x))</span>
<span id="cb16-30">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.max_pool2d(x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb16-31">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2(x))</span>
<span id="cb16-32">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.max_pool2d(x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb16-33">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> x.view(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>)</span>
<span id="cb16-34">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1(x))</span>
<span id="cb16-35">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2(x).detach().numpy()</span>
<span id="cb16-36"></span>
<span id="cb16-37">        coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_coeffs(X.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>))</span>
<span id="cb16-38"></span>
<span id="cb16-39">        logits <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> coeffs, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb16-40"></span>
<span id="cb16-41">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#sigmoid</span></span>
<span id="cb16-42">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> np.exp(logits) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> np.exp(logits))</span>
<span id="cb16-43">    </span>
<span id="cb16-44"></span>
<span id="cb16-45">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> fit(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array, batch_size: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>):</span>
<span id="cb16-46">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb16-47"></span>
<span id="cb16-48">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators):</span>
<span id="cb16-49">            idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.choice(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), batch_size, replace<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>)</span>
<span id="cb16-50">            X_batch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X[idx]</span>
<span id="cb16-51">            y_batch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[idx]</span>
<span id="cb16-52">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._fit_single_step(X_batch, y_batch)</span>
<span id="cb16-53"></span>
<span id="cb16-54"></span>
<span id="cb16-55">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _fit_single_step(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array):</span>
<span id="cb16-56"></span>
<span id="cb16-57">        X_2d <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>)</span>
<span id="cb16-58"></span>
<span id="cb16-59">        coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_coeffs(X_2d)</span>
<span id="cb16-60">        coeffs_var <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Variable(torch.tensor(coeffs).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(), requires_grad<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb16-61"></span>
<span id="cb16-62">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb16-63">        y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb16-64"></span>
<span id="cb16-65">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1(x))</span>
<span id="cb16-66">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.max_pool2d(x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb16-67">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2(x))</span>
<span id="cb16-68">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.max_pool2d(x, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb16-69">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> x.view(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">32</span>)</span>
<span id="cb16-70">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1(x))</span>
<span id="cb16-71">        x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2(x)</span>
<span id="cb16-72"></span>
<span id="cb16-73">        logit <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> coeffs_var).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb16-74"></span>
<span id="cb16-75">        loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.nn.functional.binary_cross_entropy_with_logits(logit, y)</span>
<span id="cb16-76"></span>
<span id="cb16-77">        loss.backward()</span>
<span id="cb16-78"></span>
<span id="cb16-79">        coeff_grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> coeffs_var.grad.numpy()</span>
<span id="cb16-80">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._update_booster(X_2d, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>coeff_grads)</span>
<span id="cb16-81"></span>
<span id="cb16-82">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._update_network()</span>
<span id="cb16-83"></span>
<span id="cb16-84">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _update_network(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb16-85">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1.weight.grad</span>
<span id="cb16-86">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2.weight.grad</span>
<span id="cb16-87">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1.weight.grad</span>
<span id="cb16-88">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2.weight.grad</span>
<span id="cb16-89"></span>
<span id="cb16-90">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1.weight.grad.zero_()</span>
<span id="cb16-91">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2.weight.grad.zero_()</span>
<span id="cb16-92">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1.weight.grad.zero_()</span>
<span id="cb16-93">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2.weight.grad.zero_()</span>
<span id="cb16-94"></span>
<span id="cb16-95"></span>
<span id="cb16-96">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _update_booster(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb16-97">        tree_list <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb16-98">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_booster):</span>
<span id="cb16-99">            tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth, min_samples_leaf<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>)</span>
<span id="cb16-100">            tree.fit(X, y[:,c])</span>
<span id="cb16-101">            tree_list.append(tree)</span>
<span id="cb16-102">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.boosted_trees.append(tree_list)</span>
<span id="cb16-103"></span>
<span id="cb16-104"></span>
<span id="cb16-105">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_coeffs(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb16-106">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X), <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_booster))</span>
<span id="cb16-107"></span>
<span id="cb16-108">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs</span>
<span id="cb16-109"></span>
<span id="cb16-110">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree_list <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.boosted_trees:</span>
<span id="cb16-111">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_booster):</span>
<span id="cb16-112">                output[:,c] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> tree_list[c].predict(X)</span>
<span id="cb16-113"></span>
<span id="cb16-114">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> output</span></code></pre></div>
</div>
<p>Since we perform all calculations except the gradient computation by hand, our model class does not inherit from <code>torch.nn.Module</code> here. Also, the <code>fit</code> methods differ slightly from before, due to the additional parameters we need to learn. For each layer that doesn’t use Gradient Boosting, we need to implement the gradient steps manually. Think of this as fitting depth-0 Decision Trees (constant predictions) to a constant target for any input.</p>
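<p>To make the analogy concrete, here is a minimal NumPy sketch (with hypothetical per-sample gradients) showing that a depth-0 “tree”, which can only predict the mean of its targets, turns the boosting update on a constant parameter into an ordinary batch-averaged gradient descent step:</p>

```python
import numpy as np

# A depth-0 "tree" predicts a single constant: the mean of its targets.
# Fitting it to per-sample negative gradients therefore reproduces a plain
# (batch-averaged) gradient descent step on a constant parameter.
rng = np.random.default_rng(0)
neg_grads = rng.normal(loc=-2.0, scale=0.1, size=100)  # hypothetical per-sample -dL/dtheta

constant_update = neg_grads.mean()               # what a depth-0 tree would output
theta, lr = 1.0, 0.1
theta_boosted = theta + lr * constant_update     # boosting-style update
theta_manual = theta - lr * (-neg_grads).mean()  # standard SGD on the averaged gradient

assert np.isclose(theta_boosted, theta_manual)
```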
<p>That is, our model consists of a parameter vector comprising many constant parameters and two varying ones. The latter are the ones we want to model with Boosted Trees:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Ctheta(x)=%5Cbegin%7Bpmatrix%7D%5Ctheta%5E%7B(1)%7D%20%5C%5C%20%5Cvdots%20%5C%5C%20%5Ctheta%5E%7B(K)%7D%20%5C%5C%20%5Ctheta%5E%7B(K+1)%7D(x)%20%5C%5C%20%5Ctheta%5E%7B(K+2)%7D(x)%5Cend%7Bpmatrix%7D"></p>
<p>At each gradient descent step, the constant parameters are updated as usual, while the varying parameters are updated via Gradient Boosting:</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Ctheta_m(x)=%5Cbegin%7Bpmatrix%7D%0A%5Ctheta_%7Bm%7D%5E%7B(1)%7D%20%5C%5C%20%5Cvdots%20%5C%5C%20%5Ctheta_%7Bm%7D%5E%7B(K)%7D%20%5C%5C%20%5Ctheta_%7Bm%7D%5E%7B(K+1)%7D(x)%20%5C%5C%20%5Ctheta_%7Bm%7D%5E%7B(K+2)%7D(x)%0A%5Cend%7Bpmatrix%7D%20=%0A%5Cbegin%7Bpmatrix%7D%0A%5Ctheta_%7Bm-1%7D%5E%7B(1)%7D%20%5C%5C%20%5Cvdots%20%5C%5C%20%5Ctheta_%7Bm-1%7D%5E%7B(K)%7D%20%5C%5C%20%5Ctheta_%7Bm-1%7D%5E%7B(K+1)%7D(x)%20%5C%5C%20%5Ctheta_%7Bm-1%7D%5E%7B(K+2)%7D(x)%0A%5Cend%7Bpmatrix%7D%20-%20%5Cgamma%20%5Ccdot%0A%5Cbegin%7Bpmatrix%7D%0A%5Cfrac%7B%5Cpartial%20L(%5Ctheta_%7Bm-1%7D(x),x,y)%7D%7B%5Cpartial%20%5Ctheta_%7Bm-1%7D%5E%7B(1)%7D%7D%20%5C%5C%20%5Cvdots%20%5C%5C%20%5Cfrac%7B%5Cpartial%20L(%5Ctheta_%7Bm-1%7D(x),x,y)%7D%7B%5Cpartial%20%5Ctheta_%7Bm-1%7D%5E%7B(K)%7D%7D%20%5C%5C%20%5Cfrac%7B%5Cpartial%20L(%5Ctheta_%7Bm-1%7D(x),x,y)%7D%7B%5Cpartial%20%5Ctheta_%7Bm-1%7D%5E%7B(K+1)%7D(x)%7D%20%5C%5C%20%5Cfrac%7B%5Cpartial%20L(%5Ctheta_%7Bm-1%7D(x),x,y)%7D%7B%5Cpartial%20%5Ctheta_%7Bm-1%7D%5E%7B(K+2)%7D(x)%7D%0A%5Cend%7Bpmatrix%7D%0A"></p>
<p>Note that our loss now also depends explicitly on the input <img src="https://latex.codecogs.com/png.latex?x">, not only via <img src="https://latex.codecogs.com/png.latex?%5Ctheta(x)">. The first <img src="https://latex.codecogs.com/png.latex?K"> elements are then updated in</p>
<div class="sourceCode" id="cb17" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb17-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _update_network(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb17-2">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1.weight.grad</span>
<span id="cb17-3">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2.weight.grad</span>
<span id="cb17-4">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1.weight.grad</span>
<span id="cb17-5">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2.weight.data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2.weight.grad</span>
<span id="cb17-6"></span>
<span id="cb17-7">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv1.weight.grad.zero_()</span>
<span id="cb17-8">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conv2.weight.grad.zero_()</span>
<span id="cb17-9">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1.weight.grad.zero_()</span>
<span id="cb17-10">    <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2.weight.grad.zero_()</span></code></pre></div>
<p>while the Boosting models are updated in</p>
<div class="sourceCode" id="cb18" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb18-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _update_booster(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb18-2">    tree_list <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb18-3">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_booster):</span>
<span id="cb18-4">        tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth, min_samples_leaf<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>)</span>
<span id="cb18-5">        tree.fit(X, y[:,c])</span>
<span id="cb18-6">        tree_list.append(tree)</span>
<span id="cb18-7">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.boosted_trees.append(tree_list)</span></code></pre></div>
<p>As we are dealing with binary classification, we can use the <a href="https://en.wikipedia.org/wiki/Cross_entropy#Cross-entropy_loss_function_and_logistic_regression">Binary Cross-Entropy</a> loss directly from PyTorch. Notice that we use the logits for the training steps, i.e.&nbsp;the network output before applying the sigmoid activation function.</p>
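<p>Working on logits is also numerically safer. Here is a small NumPy sketch (not part of the model code) of the stable formulation that <code>binary_cross_entropy_with_logits</code> uses internally, checked against the naive sigmoid-then-cross-entropy version:</p>

```python
import numpy as np

# Stable binary cross-entropy computed directly on logits z:
#   loss = max(z, 0) - z*y + log(1 + exp(-|z|))
# This avoids overflow in exp() for large |z|.
def bce_with_logits(logits, targets):
    return np.mean(np.maximum(logits, 0) - logits * targets
                   + np.log1p(np.exp(-np.abs(logits))))

# Naive version: apply the sigmoid first, then cross-entropy.
# Mathematically identical, but log(p) underflows for large negative logits.
def bce_naive(logits, targets):
    p = 1.0 / (1.0 + np.exp(-logits))
    return np.mean(-(targets * np.log(p) + (1 - targets) * np.log(1 - p)))

z = np.array([-3.0, 0.5, 2.0])
y = np.array([0.0, 1.0, 1.0])
assert np.isclose(bce_with_logits(z, y), bce_naive(z, y))
```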
<p>To speed things up, we also allow for random batch sampling, which effectively turns this into a Stochastic Gradient Descent algorithm. This is not strictly necessary, but it makes training much faster, in particular for the Boosting models.</p>
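<p>A minimal sketch of that sampling step, mirroring the <code>np.random.choice</code> call in <code>fit</code>:</p>

```python
import numpy as np

# Draw a random minibatch without replacement, once per boosting iteration.
# Each iteration sees a different subset of the data, which is what makes
# the procedure stochastic rather than full-batch gradient descent.
rng = np.random.default_rng(123)
n_samples, batch_size = 10_000, 500

idx = rng.choice(n_samples, size=batch_size, replace=False)
# In fit() this index would then select X[idx] and y[idx] for the step.

assert len(idx) == batch_size
assert len(np.unique(idx)) == batch_size  # replace=False guarantees no duplicates
```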
<p>Now, let us test our model on the MNIST dataset:</p>
<div id="9d31f650" class="cell" data-execution_count="69">
<div class="sourceCode cell-code" id="cb19" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb19-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> torchvision</span>
<span id="cb19-2"></span>
<span id="cb19-3">mnist_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torchvision.datasets.MNIST(root<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'./data'</span>, train<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, download<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb19-4">mnist_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torchvision.datasets.MNIST(root<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'./data'</span>, train<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>, download<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb19-5"></span>
<span id="cb19-6">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mnist_train.data.numpy()</span>
<span id="cb19-7">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mnist_train.targets.numpy()</span>
<span id="cb19-8"></span>
<span id="cb19-9">X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mnist_test.data.numpy()</span>
<span id="cb19-10">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mnist_test.targets.numpy()</span>
<span id="cb19-11"></span>
<span id="cb19-12">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train[(y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> (y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]</span>
<span id="cb19-13">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train[(y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> (y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]</span>
<span id="cb19-14"></span>
<span id="cb19-15">X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_test[(y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> (y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]</span>
<span id="cb19-16">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_test[(y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|</span> (y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]</span>
<span id="cb19-17"></span>
<span id="cb19-18"></span>
<span id="cb19-19">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>)</span>
<span id="cb19-20">X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>)</span></code></pre></div>
</div>
<div id="68f3bc42" class="cell" data-execution_count="70">
<div class="sourceCode cell-code" id="cb20" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb20-1">np.random.normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb20-2">torch.manual_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb20-3"></span>
<span id="cb20-4">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ConvolutionalNetWithBoostedTrees(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.001</span>)</span>
<span id="cb20-5">model.fit(X_train, y_train, batch_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>)</span>
<span id="cb20-6"></span>
<span id="cb20-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#evaluate</span></span>
<span id="cb20-8">y_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.predict(X_test)</span>
<span id="cb20-9">np.mean(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">round</span>(y_pred) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> y_test)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="70">
<pre><code>0.992434988179669</code></pre>
</div>
</div>
<p>Although this is a toy problem, almost 100% test accuracy is a good sign; we can be confident that our approach works in principle. As some final shenanigans, let us also consider a measure of local feature importance with respect to the boosting models: for a given input, we count, across all Decision Trees, how often each pixel (= feature) appears in a split rule. Then we can plot the result as an image:</p>
<div id="8783d58e" class="cell" data-execution_count="87">
<div class="sourceCode cell-code" id="cb22" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb22-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># get all features in each node from root to leaf that target_x traverses in target_tree</span></span>
<span id="cb22-2"></span>
<span id="cb22-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_features_in_path(target_tree, target_x):</span>
<span id="cb22-4">    nodes <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb22-5">    node <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb22-6">    nodes.append(node)</span>
<span id="cb22-7">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">while</span> node <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:</span>
<span id="cb22-8">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> target_x[target_tree.tree_.feature[node]] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> target_tree.tree_.threshold[node]:</span>
<span id="cb22-9">            node <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> target_tree.tree_.children_left[node]</span>
<span id="cb22-10">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb22-11">            node <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> target_tree.tree_.children_right[node]</span>
<span id="cb22-12">        nodes.append(node)</span>
<span id="cb22-13"></span>
<span id="cb22-14">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get all features in each node from root to leaf</span></span>
<span id="cb22-15">    feature_importances <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>)</span>
<span id="cb22-16">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> node <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> nodes:</span>
<span id="cb22-17">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> node <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:</span>
<span id="cb22-18">            target_feature <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> target_tree.tree_.feature[node]</span>
<span id="cb22-19">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> target_feature <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:</span>
<span id="cb22-20">                feature_importances[target_feature] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb22-21">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> feature_importances.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>)</span>
<span id="cb22-22"></span>
<span id="cb22-23">target_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_test[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb22-24"></span>
<span id="cb22-25">feature_importance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>))</span>
<span id="cb22-26"></span>
<span id="cb22-27"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree_list <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> model.boosted_trees:</span>
<span id="cb22-28">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> tree_list:</span>
<span id="cb22-29">        feature_importance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> get_features_in_path(tree, target_x) </span>
<span id="cb22-30"></span>
<span id="cb22-31">plt.imshow(target_x.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>), cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'gray'</span>)</span>
<span id="cb22-32">plt.imshow(feature_importance, cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Greens'</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.85</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-15-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<div id="32349ba4" class="cell" data-execution_count="88">
<div class="sourceCode cell-code" id="cb23" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb23-1">target_x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_test[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb23-2"></span>
<span id="cb23-3">feature_importance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>))</span>
<span id="cb23-4"></span>
<span id="cb23-5"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree_list <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> model.boosted_trees:</span>
<span id="cb23-6">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> tree_list:</span>
<span id="cb23-7">        feature_importance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> get_features_in_path(tree, target_x) </span>
<span id="cb23-8"></span>
<span id="cb23-9">plt.imshow(target_x.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">28</span>), cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'gray'</span>)</span>
<span id="cb23-10">plt.imshow(feature_importance, cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Greens'</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.85</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything_files/figure-html/cell-16-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This could be interpreted as follows: The primary region of interest lies around the center of the image (the deep green dots). 1s are mostly colored there, while 0s are usually blank around the center. For the 0 in the bottom plot, we also see some lighter regions of interest at the left and right edges of the digit.</p>
<p>Keep in mind that this is just a freestyled measure of feature importance. It does not even consider the CNN part of our model, which is clearly insufficient for a rigorous analysis. On the other hand, the output still looks reasonable and gives us at least some idea of what the model is doing.</p>
<p>In addition, larger or actual RGB images will likely be much harder to handle with this kind of model in general. Nevertheless, we have created a Gradient Boosting model for which PyTorch is an actual life-saver.</p>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>As we have seen, marrying Gradient Boosting with PyTorch can be a useful approach. Although manual derivatives will be faster, PyTorch allows us to handle arbitrarily complex models holistically. Since Gradient Boosting is generally a very powerful model, this might create opportunities to improve existing approaches that usually don’t include a Boosting component.</p>
<p>This also comes with a slight improvement in interpretability, as tree-based methods provide at least some insight into a model’s inner workings. In one of the next articles, we’ll take a look at how all of this can also be used for time series models.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Friedman, Jerome H. Greedy function approximation: A gradient boosting machine. Annals of statistics, 2001</p>
<p><strong>[2]</strong> Hastie, Trevor; Tibshirani, Robert. Varying-coefficient models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 1993</p>
<p><strong>[3]</strong> Paszke, Adam, et al.&nbsp;PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 2019</p>


</section>

 ]]></description>
  <category>Decision Trees</category>
  <category>Gradient Boosting</category>
  <guid>https://www.sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything.html</guid>
  <pubDate>Thu, 18 Jan 2024 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Varying Coefficient Boosting for geospatial and temporal data</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/varying-coefficient-boosting-for-geospatial-and-temporal-data.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p><a href="https://sarem-seitz.com/posts/with-pytorch-i-can-gradient-boost-anything/#:~:text=Varying%20Coefficient%20Boosting">Last time</a>, amongst other ideas, we looked at how to implement Varying Coefficient Boosting in PyTorch. These types of models are quite useful, as they are considerably flexible and (locally) interpretable at the same time.</p>
<p>By using Boosted Decision Trees, we even gain some interpretability for the coefficient functions themselves. The well-known predictive performance of Gradient Boosting also seems to carry over to such models. Personally, I believe that these two aspects make Gradient Boosting with Decision Trees the preferred base method for the Varying Coefficient approach.</p>
<p>Today, we will look at how to apply this approach to geospatial data and data with a prevalent temporal component. Our main goal is to let the coefficients vary over space and/or time. This should improve interpretability even further - the model coefficients will only change per region or per time period. Especially for geospatial data, we can then make some neat geo-plots to visualize the results.</p>
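<p>To see why region-wise varying coefficients are attractive, consider a minimal synthetic sketch (all names and data here are made up for illustration): the slope of a single regressor flips sign depending on latitude, and a shallow Decision Tree fit on the coordinates recovers the region-wise coefficient from crude per-sample slope estimates.</p>

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(123)

# synthetic geospatial data: the slope of one regressor depends on latitude
lat_lon = rng.uniform(0, 10, size=(500, 2))      # columns: latitude, longitude
slope = np.where(lat_lon[:, 0] > 5, 2.0, -1.0)   # coefficient varies by region
x_reg = rng.normal(size=500)
y = slope * x_reg + rng.normal(scale=0.1, size=500)

# crude per-sample slope estimate y / x_reg (avoiding tiny denominators),
# then let a shallow tree partition the map into coefficient regions
mask = np.abs(x_reg) > 0.5
tree = DecisionTreeRegressor(max_depth=2)
tree.fit(lat_lon[mask], (y / x_reg)[mask])

print(tree.predict([[8.0, 5.0]]))   # northern region: close to 2.0
print(tree.predict([[2.0, 5.0]]))   # southern region: close to -1.0
```

<p>In the actual model below, the Boosted Trees play exactly this role of mapping coordinates to coefficients, just jointly for all regressors and fitted on the loss gradients rather than on ratio estimates.</p>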
</section>
<section id="adapting-the-varying-coefficient-model-from-before" class="level2">
<h2 class="anchored" data-anchor-id="adapting-the-varying-coefficient-model-from-before">Adapting the Varying Coefficient model from before</h2>
<p>If you take a close look at the previous implementation of Varying Coefficient Boosting, you will notice one important aspect: Our model presumed that the coefficient function features and the regression features are equivalent. I.e., in the model formula</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7By%7D%20=%20F%5E%7B(0)%7D(x)%20+%20F%5E%7B(1)%7D(x)%5Ccdot%20x%5E%7B(1)%7D%20+%20%5Ccdots%20+%20F%5E%7B(M)%7D(x)%5Ccdot%20x%5E%7B(M)%7D,"></p>
<p>we could not allow the <img src="https://latex.codecogs.com/png.latex?x"> in <img src="https://latex.codecogs.com/png.latex?F%5E%7B(m)%7D(x)"> to be different from the regressors <img src="https://latex.codecogs.com/png.latex?x%5E%7B(m)%7D">. Now, however, the input vector <img src="https://latex.codecogs.com/png.latex?x"> is presumed to contain, for example, <img src="https://latex.codecogs.com/png.latex?x%5E%5Ctext%7Blat%7D"> and <img src="https://latex.codecogs.com/png.latex?x%5E%5Ctext%7Blon%7D">, or <img src="https://latex.codecogs.com/png.latex?x%5E%5Ctext%7Byear%7D"> and <img src="https://latex.codecogs.com/png.latex?x%5E%5Ctext%7Bmonth%7D">. We therefore only want those features to be used in the <img src="https://latex.codecogs.com/png.latex?F%5E%7B(m)%7D(x)">. For the actual regressors, we will use the remaining features, <img src="https://latex.codecogs.com/png.latex?x%5Csetminus%20%5C%7Bx%5E%5Ctext%7Blat%7D,%20x%5E%5Ctext%7Blon%7D,...%5C%7D">.</p>
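<p>Written as code, the prediction step with separated feature sets looks as follows (a minimal NumPy sketch; <code>coeff_fns</code> is a hypothetical list of fitted coefficient functions, one per regressor plus an intercept):</p>

```python
import numpy as np

def varying_coefficient_predict(coeff_fns, X_reg, X_coeff):
    # prepend a column of ones so the first coefficient acts as intercept F^(0)
    X_reg = np.concatenate([np.ones((len(X_reg), 1)), X_reg], axis=1)
    # evaluate every coefficient function on the coefficient features only
    coeffs = np.column_stack([f(X_coeff) for f in coeff_fns])
    # per-sample dot product between regressors and their varying coefficients
    return np.sum(X_reg * coeffs, axis=1)

# toy example: intercept plus one regressor whose slope equals the latitude
coeff_fns = [lambda Xc: np.ones(len(Xc)),   # F^(0), constant intercept of 1
             lambda Xc: Xc[:, 0]]           # F^(1), slope given by latitude
X_reg = np.array([[2.0], [3.0]])
X_coeff = np.array([[0.5], [1.0]])          # latitude only
print(varying_coefficient_predict(coeff_fns, X_reg, X_coeff))  # [2. 4.]
```

<p>The <code>.predict</code> method of the adjusted model class follows exactly this pattern, with the coefficient functions given by sums of Boosted Trees.</p>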
<p>Thus, we introduce <code>X_reg</code> and <code>X_coeff</code> in the <code>.fit</code> and <code>.predict</code> methods to differentiate both feature sets. Adjusting the existing model class is then fairly straightforward:</p>
<div id="1b916bc3-3a76-4ce5-8d05-85c9de551758" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mpl_toolkits.basemap <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Basemap</span>
<span id="cb1-6"></span>
<span id="cb1-7"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.tree <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> DecisionTreeRegressor</span>
<span id="cb1-8"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.datasets <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> fetch_california_housing</span>
<span id="cb1-9"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.datasets <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> fetch_openml</span>
<span id="cb1-10"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.model_selection <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> train_test_split</span>
<span id="cb1-11"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.ensemble <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> GradientBoostingRegressor</span>
<span id="cb1-12"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.preprocessing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> StandardScaler</span>
<span id="cb1-13"></span>
<span id="cb1-14"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> json</span>
<span id="cb1-15"></span>
<span id="cb1-16"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> torch</span>
<span id="cb1-17"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> torch.autograd <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Variable</span>
<span id="cb1-18"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> torch.optim <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> optim</span>
<span id="cb1-19"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> torch.nn <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> nn</span>
<span id="cb1-20"></span>
<span id="cb1-21"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> List, Optional, Union</span>
<span id="cb1-22"></span>
<span id="cb1-23"></span>
<span id="cb1-24"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">###Model</span></span>
<span id="cb1-25"></span>
<span id="cb1-26"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> VaryingCoefficientGradientBoosting:</span>
<span id="cb1-27"></span>
<span id="cb1-28">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>,</span>
<span id="cb1-29">                 learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>,</span>
<span id="cb1-30">                 max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb1-31">                 n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>):</span>
<span id="cb1-32"></span>
<span id="cb1-33">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> learning_rate</span>
<span id="cb1-34">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> max_depth</span>
<span id="cb1-35">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_estimators</span>
<span id="cb1-36"></span>
<span id="cb1-37">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs: Optional[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span></span>
<span id="cb1-38">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.coeff_trees: List[List[DecisionTreeRegressor]] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-39">        </span>
<span id="cb1-40">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained: <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb1-41"></span>
<span id="cb1-42"></span>
<span id="cb1-43">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@property</span></span>
<span id="cb1-44">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> n_coefficients(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>:</span>
<span id="cb1-45">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained:</span>
<span id="cb1-46">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-47">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb1-48">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span> </span>
<span id="cb1-49"></span>
<span id="cb1-50"></span>
<span id="cb1-51">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_reg: np.array, X_coeff: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-52">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb1-53"></span>
<span id="cb1-54">        X_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X_reg), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X_reg], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-55"></span>
<span id="cb1-56">        coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_coeffs(X_coeff)</span>
<span id="cb1-57">        predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(X_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> coeffs, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-58"></span>
<span id="cb1-59">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> predictions</span>
<span id="cb1-60"></span>
<span id="cb1-61"></span>
<span id="cb1-62">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_raw(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_coeff: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-63">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb1-64">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_coeffs(X_coeff)</span>
<span id="cb1-65">    </span>
<span id="cb1-66"></span>
<span id="cb1-67">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> fit(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_reg: np.array, X_coeff: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-68">        X_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.ones(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X_reg), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), X_reg], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-69">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._fit_initial(X_reg, y)</span>
<span id="cb1-70"></span>
<span id="cb1-71">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb1-72"></span>
<span id="cb1-73">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_estimators):</span>
<span id="cb1-74">            coeff_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._predict_raw(X_coeff)</span>
<span id="cb1-75"></span>
<span id="cb1-76">            gradients <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._get_gradients(X_reg, y, coeff_pred)</span>
<span id="cb1-77"></span>
<span id="cb1-78">            new_trees <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-79"></span>
<span id="cb1-80">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_coefficients):</span>
<span id="cb1-81">                coeff_tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.max_depth)</span>
<span id="cb1-82">                coeff_tree.fit(X_coeff, gradients[:,c])</span>
<span id="cb1-83"></span>
<span id="cb1-84">                new_trees.append(coeff_tree)</span>
<span id="cb1-85"></span>
<span id="cb1-86">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.coeff_trees.append(new_trees)</span>
<span id="cb1-87"></span>
<span id="cb1-88"></span>
<span id="cb1-89">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _fit_initial(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_reg: np.array, y: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-90">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">assert</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.is_trained</span>
<span id="cb1-91"></span>
<span id="cb1-92">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, X_reg.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]))</span>
<span id="cb1-93">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(y)</span>
<span id="cb1-94"></span>
<span id="cb1-95"></span>
<span id="cb1-96">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _get_gradients(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_reg: np.array, y: np.array, coeff_pred: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-97">        X_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_reg).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb1-98">        y_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb1-99">        y_pred_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Variable(torch.tensor(coeff_pred).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(), requires_grad<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb1-100"></span>
<span id="cb1-101">        sse <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (y_torch<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>(X_torch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> y_pred_torch).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">pow</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.0</span>).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>() <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#negative sse to get negative gradient</span></span>
<span id="cb1-102">        sse.backward() </span>
<span id="cb1-103">        grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_pred_torch.grad.numpy()</span>
<span id="cb1-104"></span>
<span id="cb1-105">        grads[grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)</span>
<span id="cb1-106">        grads[grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(grads, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span>
<span id="cb1-107"></span>
<span id="cb1-108">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> grads</span>
<span id="cb1-109"></span>
<span id="cb1-110"></span>
<span id="cb1-111">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _predict_coeffs(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_coeff: np.array) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> np.array:</span>
<span id="cb1-112">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.zeros(shape <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X_coeff), <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_coefficients))</span>
<span id="cb1-113"></span>
<span id="cb1-114">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.init_coeffs</span>
<span id="cb1-115"></span>
<span id="cb1-116">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> tree_list <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.coeff_trees:</span>
<span id="cb1-117">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_coefficients):</span>
<span id="cb1-118">                output[:,c] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> tree_list[c].predict(X_coeff)</span>
<span id="cb1-119"></span>
<span id="cb1-120">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> output</span></code></pre></div>
</div>
<p>Not too difficult. Next, we can directly apply this updated model to the <a href="https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html">California Housing dataset</a>.</p>
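Before applying the model to real data, a minimal toy example (synthetic data, independent of the implementation above) illustrates what "varying coefficients" means here: the slope on a regressor feature is itself a function of another feature.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 1000
x = rng.normal(size=n)    # regressor feature
z = rng.uniform(size=n)   # coefficient feature (think: location)

beta = 1.0 + 2.0 * z      # the coefficient varies with z
y = beta * x              # no intercept and no noise, for clarity

# A single global linear fit averages the varying slope away ...
global_slope = (x @ y) / (x @ x)

# ... while the true coefficient function recovers y exactly.
y_hat = (1.0 + 2.0 * z) * x
assert np.allclose(y_hat, y)
```

The global least-squares slope lands near the average coefficient (about 2 here), while the varying-coefficient form reproduces every observation; this is exactly the structure the boosting model above learns, with trees estimating the coefficient functions.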
<section id="varying-coefficient-boosting-for-california-housing" class="level3">
<h3 class="anchored" data-anchor-id="varying-coefficient-boosting-for-california-housing">Varying Coefficient Boosting for California Housing</h3>
<p>To accommodate our updated model, we also need to split the features into <code>X_reg</code> and <code>X_coeff</code>. Since <code>train_test_split</code> from <code>sklearn</code> expects a single feature matrix, we perform the train-test split first and separate the features afterwards. For comparison, we also fit a standard Gradient Boosting model to the data. That model uses the full feature set, so it gets its own scaled matrices, <code>X_train_scaled</code> and <code>X_test_scaled</code>:</p>
<div id="d3f39446" class="cell" data-execution_count="12">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">### Data</span></span>
<span id="cb2-2"></span>
<span id="cb2-3">housing <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fetch_california_housing()</span>
<span id="cb2-4">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> housing.data</span>
<span id="cb2-5"></span>
<span id="cb2-6">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> housing.target</span>
<span id="cb2-7">X_train, X_test, y_train, y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_test_split(X, y, test_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.33</span>, random_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb2-8"></span>
<span id="cb2-9">X_train_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train[:, :<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]</span>
<span id="cb2-10">X_train_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train[:, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:]</span>
<span id="cb2-11">X_test_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_test[:, :<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]</span>
<span id="cb2-12">X_test_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_test[:, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:]</span>
<span id="cb2-13"></span>
<span id="cb2-14">X_reg_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(X_train_reg, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-15">X_reg_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(X_train_reg, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-16"></span>
<span id="cb2-17">X_train_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_train_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_reg_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_reg_std</span>
<span id="cb2-18">X_test_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_test_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_reg_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_reg_std</span>
<span id="cb2-19"></span>
<span id="cb2-20">X_full_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(X_train, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-21">X_full_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(X_train, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-22"></span>
<span id="cb2-23">X_train_scaled <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_full_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_full_std</span>
<span id="cb2-24">X_test_scaled <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_full_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_full_std</span>
<span id="cb2-25"></span>
<span id="cb2-26">y_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(y_train)</span>
<span id="cb2-27">y_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(y_train)</span>
<span id="cb2-28"></span>
<span id="cb2-29">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> y_std</span>
<span id="cb2-30">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> y_std</span>
<span id="cb2-31"></span>
<span id="cb2-32"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">### Models</span></span>
<span id="cb2-33">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb2-34"></span>
<span id="cb2-35">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientGradientBoosting(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb2-36">model.fit(X_train_reg, X_train_coeff, y_train)</span>
<span id="cb2-37"></span>
<span id="cb2-38">predictions_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.predict(X_test_reg, X_test_coeff)</span>
<span id="cb2-39"></span>
<span id="cb2-40">rmse_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb2-41"></span>
<span id="cb2-42"></span>
<span id="cb2-43">gb_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> GradientBoostingRegressor(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb2-44">gb_model.fit(X_train_scaled, y_train)</span>
<span id="cb2-45"></span>
<span id="cb2-46">predictions_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gb_model.predict(X_test_scaled)</span>
<span id="cb2-47"></span>
<span id="cb2-48">rmse_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span></code></pre></div>
</div>
<p>For further comparison, we also fit a Varying Coefficient Neural Network. As with Varying Coefficient Boosting, the geographical features feed the coefficient functions, while the remaining features serve as regressors. The model is implemented in PyTorch via the <code>nn</code> module and is fairly simple, with two hidden layers:</p>
<div id="dda8c1c1" class="cell" data-execution_count="13">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> VaryingCoefficientNeuralNetwork(nn.Module):</span>
<span id="cb3-2">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, input_dim, varying_input_dim, hidden_dim):</span>
<span id="cb3-3">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>(VaryingCoefficientNeuralNetwork, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>).<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb3-4">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.input_dim <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> input_dim</span>
<span id="cb3-5">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.varying_input_dim <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> varying_input_dim</span>
<span id="cb3-6">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.hidden_dim <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> hidden_dim</span>
<span id="cb3-7"></span>
<span id="cb3-8">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(varying_input_dim, hidden_dim)</span>
<span id="cb3-9">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(hidden_dim, hidden_dim)</span>
<span id="cb3-10">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc3 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.Linear(hidden_dim, input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-11"></span>
<span id="cb3-12">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> forward(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, x, x_varying):</span>
<span id="cb3-13">        varying_coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc3(torch.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2(torch.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1(x_varying)))))</span>
<span id="cb3-14">        x_with_ones <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.cat([torch.ones(x.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), x], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-15"></span>
<span id="cb3-16">        result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(x_with_ones <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> varying_coeffs, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) </span>
<span id="cb3-17">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> result</span>
<span id="cb3-18"></span>
<span id="cb3-19">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict_coefficients(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, x_varying):</span>
<span id="cb3-20">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># mirror forward() so the returned coefficients match those used in prediction</span></span>
<span id="cb3-20b">        varying_coeffs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc3(torch.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc2(torch.relu(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.fc1(x_varying)))))</span>
<span id="cb3-21">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> varying_coeffs</span>
<span id="cb3-22"></span>
<span id="cb3-23">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb3-24">torch.manual_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb3-25"></span>
<span id="cb3-26">model_net <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientNeuralNetwork(input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>X_train_reg.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, varying_input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>X_train_coeff.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], hidden_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb3-27">criterion <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.MSELoss()</span>
<span id="cb3-28">optimizer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> optim.Adam(model_net.parameters(), lr<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.001</span>)</span>
<span id="cb3-29"></span>
<span id="cb3-30">X_train_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_train_reg).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb3-31">X_train_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.cat([torch.ones(X_train_reg_tensor.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), X_train_reg_tensor], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-32"></span>
<span id="cb3-33">X_test_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_test_reg).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb3-34">X_test_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.cat([torch.ones(X_test_reg_tensor.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), X_test_reg_tensor], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-35"></span>
<span id="cb3-36">X_train_coeff_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_train_coeff).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb3-37">X_test_coeff_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_test_coeff).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb3-38"></span>
<span id="cb3-39">y_train_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y_train).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb3-40"></span>
<span id="cb3-41">num_epochs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5000</span></span>
<span id="cb3-42"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> epoch <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(num_epochs):</span>
<span id="cb3-43">    optimizer.zero_grad()</span>
<span id="cb3-44">    outputs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_net(X_train_reg_tensor, X_train_coeff_tensor)</span>
<span id="cb3-45">    loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> criterion(outputs, y_train_tensor)</span>
<span id="cb3-46">    loss.backward()</span>
<span id="cb3-47">    optimizer.step()</span>
<span id="cb3-48"></span>
<span id="cb3-49">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_net(X_test_reg_tensor, X_test_coeff_tensor)</span>
<span id="cb3-50"></span>
<span id="cb3-51">rmse_var_coeff_net <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions.detach().numpy() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb3-52"></span>
<span id="cb3-53"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for Varying Coefficient Boosting: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_varcoeff<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb3-54"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for Gradient Boosting: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_gb<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb3-55"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for varying coefficient Neural Network: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_var_coeff_net<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>RMSE for Varying Coefficient Boosting: 0.5317705220008321
RMSE for Gradient Boosting: 0.4893399365654721
RMSE for varying coefficient Neural Network: 0.6654126590680458</code></pre>
</div>
</div>
<p>By giving up a small amount of accuracy, we get a model that is far more interpretable. Now, we can visualize the coefficients on a map:</p>
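For intuition before the plots: a varying-coefficient model keeps the linear form, but lets the coefficients themselves be functions of the coordinate inputs. A minimal numpy sketch of that prediction rule (function and variable names here are hypothetical, not the ones used above):

```python
import numpy as np

def varying_coefficient_predict(X, Z, coeff_fn):
    """Locally linear prediction with coordinate-dependent coefficients.

    X: (n, p) regression features, first column is the intercept
    Z: (n, q) coordinate features (e.g. latitude/longitude)
    coeff_fn: maps Z to an (n, p) array of coefficients
    """
    coeffs = coeff_fn(Z)
    return np.sum(coeffs * X, axis=1)

# Toy coefficient function: the slope grows linearly with the coordinate
coeff_fn = lambda Z: np.column_stack([np.ones(len(Z)), 1.0 + Z[:, 0]])

X = np.column_stack([np.ones(3), np.full(3, 2.0)])
Z = np.array([[0.0], [1.0], [2.0]])
print(varying_coefficient_predict(X, Z, coeff_fn))  # [3. 5. 7.]
```

The boosting and network variants above differ only in how `coeff_fn` is estimated, which is exactly what the coefficient maps below visualize.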
<div id="651274ac" class="cell" data-execution_count="14">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Define the range of latitude and longitude</span></span>
<span id="cb5-2">lat_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]), np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb5-3">lon_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb5-4"></span>
<span id="cb5-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create the meshgrid</span></span>
<span id="cb5-6">lat_mesh, lon_mesh <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.meshgrid(lat_range, lon_range)</span>
<span id="cb5-7"></span>
<span id="cb5-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Flatten the meshgrid</span></span>
<span id="cb5-9">lat_lon_mesh <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([lat_mesh.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), lon_mesh.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)  </span>
<span id="cb5-10"></span>
<span id="cb5-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Predict using the model</span></span>
<span id="cb5-12">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model._predict_coeffs(lat_lon_mesh)[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb5-13"></span>
<span id="cb5-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a basemap</span></span>
<span id="cb5-15">m <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Basemap(llcrnrlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), llcrnrlat<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]),</span>
<span id="cb5-16">            urcrnrlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), urcrnrlat<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]),</span>
<span id="cb5-17">            projection<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'lcc'</span>, lat_0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.mean(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]), lon_0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.mean(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]))</span>
<span id="cb5-18"></span>
<span id="cb5-19"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a contour plot</span></span>
<span id="cb5-20">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>))</span>
<span id="cb5-21">m.contourf(lon_mesh, lat_mesh, predictions.reshape(lat_mesh.shape), cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'coolwarm'</span>,latlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, levels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>)</span>
<span id="cb5-22">plt.colorbar(label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Predictions'</span>)</span>
<span id="cb5-23">m.scatter(X[:, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], X[:, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>], latlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>y, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-24">m.drawcoastlines()</span>
<span id="cb5-25">m.drawstates()</span>
<span id="cb5-26">m.drawcountries()</span>
<span id="cb5-27"></span>
<span id="cb5-28">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'MedianIncome Coefficient (standardized)'</span>)</span>
<span id="cb5-29">plt.show()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-boosting-for-geospatial-and-temporal-data_files/figure-html/cell-5-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>We see that the effect of <code>MedInc</code> is relatively stable across most areas. In some coastal regions, particularly around Los Angeles and southern San Francisco/Palo Alto, the effect of Median Income is slightly higher. Compare this to the varying coefficient estimated by the Varying Coefficient Neural Network:</p>
<div id="bfb0dfb4" class="cell" data-execution_count="15">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Predict using the model</span></span>
<span id="cb6-2">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_net.predict_coefficients(torch.tensor(lat_lon_mesh).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>())[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].detach().numpy()</span>
<span id="cb6-3"></span>
<span id="cb6-4"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a basemap</span></span>
<span id="cb6-5">m <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Basemap(llcrnrlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), llcrnrlat<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]),</span>
<span id="cb6-6">            urcrnrlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), urcrnrlat<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]),</span>
<span id="cb6-7">            projection<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'lcc'</span>, lat_0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.mean(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]), lon_0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>np.mean(X_train_coeff[:, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]))</span>
<span id="cb6-8"></span>
<span id="cb6-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create a contour plot</span></span>
<span id="cb6-10">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>))</span>
<span id="cb6-11">m.contourf(lon_mesh, lat_mesh, predictions.reshape(lat_mesh.shape), cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'coolwarm'</span>,latlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, levels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>)</span>
<span id="cb6-12">plt.colorbar(label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'Predictions'</span>)</span>
<span id="cb6-13">m.scatter(X[:, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], X[:, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>], latlon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>y, s<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb6-14">m.drawcoastlines()</span>
<span id="cb6-15">m.drawstates()</span>
<span id="cb6-16">m.drawcountries()</span>
<span id="cb6-17"></span>
<span id="cb6-18">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'MedianIncome Coefficient (standardized; Neural Network estimate)'</span>)</span>
<span id="cb6-19">plt.show()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-boosting-for-geospatial-and-temporal-data_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>With the Varying Coefficient Neural Network, the coefficient variation looks quite different. Instead of rectangular areas, the coefficients now vary smoothly across space. This is a direct consequence of neural networks representing continuous functions, whereas boosted trees can also capture non-smooth functions that change rapidly within a small area. For geospatial data, this smoothness can be limiting, for example when neighborhood conditions change rather abruptly.</p>
<p>Nevertheless, the network, too, estimates that coastal home prices are more strongly influenced by Median Income.</p>
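The smooth-versus-piecewise distinction is easiest to see in one dimension: a tree-based coefficient estimate is a step function over the coordinate, while a network-style estimate transitions continuously. A hypothetical illustration (not fitted to any data):

```python
import numpy as np

z = np.linspace(0.0, 1.0, 5)

# Tree-style coefficient: piecewise constant, one split at z = 0.5
beta_tree = np.where(z < 0.5, 1.0, 2.0)

# Network-style coefficient: smooth sigmoid transition around z = 0.5
beta_net = 1.0 + 1.0 / (1.0 + np.exp(-20.0 * (z - 0.5)))

print(beta_tree)  # [1. 1. 2. 2. 2.]
```

Both reach roughly the same levels on either side; they differ only in whether the coefficient may jump at a boundary, which is what the rectangular versus smooth patterns in the two maps reflect.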
</section>
<section id="varying-coefficient-boosting-for-bike-sharing-demand" class="level3">
<h3 class="anchored" data-anchor-id="varying-coefficient-boosting-for-bike-sharing-demand">Varying Coefficient Boosting for Bike Sharing Demand</h3>
<p>Next up, we take a look at the <a href="https://www.kaggle.com/c/bike-sharing-demand">Bike Sharing Demand</a> dataset from Kaggle. The dataset contains hourly bike rental data from Washington D.C. Month, weekday and hour are key features here. Since these features are cyclical, we encode them with sine and cosine transforms before fitting the model.</p>
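Why the sine/cosine encoding matters: hour 23 and hour 0 are one hour apart on the clock but 23 units apart as raw numbers. Mapping each hour onto the unit circle makes the encoded points close, as a quick check shows:

```python
import numpy as np

# Raw hour values: numerically 23 apart, but adjacent on the clock
hours = np.array([23.0, 0.0])

# Sine/cosine encoding places each hour on the unit circle
enc = np.column_stack([np.sin(2 * np.pi * hours / 24),
                       np.cos(2 * np.pi * hours / 24)])

# Euclidean distance between the encoded points is small, as desired
dist = np.linalg.norm(enc[0] - enc[1])
print(round(dist, 3))  # 0.261
```

The same reasoning applies to month (period 12) and weekday (period 7), which is why the code below applies the transform to all three.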
<div id="798d8a57" class="cell" data-execution_count="16">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"></span>
<span id="cb7-2">bike_sharing <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fetch_openml(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Bike_Sharing_Demand"</span>, version<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, as_frame<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb7-3">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> bike_sharing.frame</span>
<span id="cb7-4"></span>
<span id="cb7-5">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb7-6">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"time"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(X)) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#add time axis for linear trend</span></span>
<span id="cb7-7">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[:,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb7-8"></span>
<span id="cb7-9">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month_sin"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)</span>
<span id="cb7-10">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month_cos"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cos(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)</span>
<span id="cb7-11">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"hour_sin"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"hour"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">24</span>)</span>
<span id="cb7-12">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"hour_cos"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cos(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"hour"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">24</span>)</span>
<span id="cb7-13">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weekday_sin"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weekday"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>)</span>
<span id="cb7-14">X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weekday_cos"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cos(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> X[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weekday"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>)</span>
<span id="cb7-15"></span>
<span id="cb7-16">X.drop([<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"month"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"hour"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weekday"</span>], axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, inplace<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb7-17"></span>
<span id="cb7-18"></span>
<span id="cb7-19"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Create dummy variables for the specified columns</span></span>
<span id="cb7-20">dummy_cols <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"holiday"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"workingday"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weather"</span>]</span>
<span id="cb7-21">dummy_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.get_dummies(X[dummy_cols], drop_first<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb7-22"></span>
<span id="cb7-23"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Concatenate the dummy variables with the remaining columns</span></span>
<span id="cb7-24">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([X.drop(dummy_cols, axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), dummy_df], axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-25"></span>
<span id="cb7-26">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>]</span>
<span id="cb7-27">X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>:]</span>
<span id="cb7-28"></span>
<span id="cb7-29">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(y.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-30">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(y.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>:]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-31"></span>
<span id="cb7-32">data_features <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weather_heavy_rain"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weather_misty"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"weather_rain"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"temp"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"feel_temp"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"humidity"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"windspeed"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"time"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"holiday_False"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"workingday_False"</span>]</span>
<span id="cb7-33"></span>
<span id="cb7-34">X_train_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.float32(X_train[data_features])</span>
<span id="cb7-35">X_train_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.float32(X_train.drop(data_features, axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb7-36"></span>
<span id="cb7-37">X_test_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.float32(X_test[data_features])</span>
<span id="cb7-38">X_test_coeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.float32(X_test.drop(data_features, axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb7-39"></span>
<span id="cb7-40">X_reg_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(X_train_reg, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb7-41">X_reg_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(X_train_reg, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb7-42"></span>
<span id="cb7-43">X_train_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_train_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_reg_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_reg_std</span>
<span id="cb7-44">X_test_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_test_reg <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_reg_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_reg_std</span>
<span id="cb7-45"></span>
<span id="cb7-46">X_full_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(X_train, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb7-47">X_full_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(X_train, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb7-48"></span>
<span id="cb7-49">X_train_scaled <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_full_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_full_std</span>
<span id="cb7-50">X_test_scaled <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> X_full_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> X_full_std</span>
<span id="cb7-51"></span>
<span id="cb7-52">y_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(y_train)</span>
<span id="cb7-53">y_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(y_train)</span>
<span id="cb7-54"></span>
<span id="cb7-55">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> y_std</span>
<span id="cb7-56">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_mean) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> y_std</span>
<span id="cb7-57"></span></code></pre></div>
</div>
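<p>As a side note, predictions made on this transformed scale have to be mapped back before they can be read as raw rental counts: first undo the standardization, then undo the <code>log(y+1)</code> transform. A minimal sketch of the round trip (variable names mirror the cell above):</p>

```python
import numpy as np

# toy target resembling count data (includes zero, hence log1p)
y = np.array([0.0, 3.0, 10.0, 55.0])

# forward transform as in the cell above: log(y + 1), then standardize
y_log = np.log(y + 1)
y_mean, y_std = y_log.mean(), y_log.std()
y_scaled = (y_log - y_mean) / y_std

# inverse transform: undo standardization, then undo the log1p
y_back = np.exp(y_scaled * y_std + y_mean) - 1
print(np.allclose(y_back, y))  # True -- round trip recovers the original counts
```

<p>The <code>+1</code> inside the log matters here because the hourly rental counts contain zeros, where a plain log would be undefined.</p>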
<div id="9cbb6552" class="cell" data-execution_count="17">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb8-2"></span>
<span id="cb8-3">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientGradientBoosting(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb8-4">model.fit(X_train_reg, X_train_coeff, y_train)</span>
<span id="cb8-5"></span>
<span id="cb8-6">predictions_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.predict(X_test_reg, X_test_coeff)</span>
<span id="cb8-7"></span>
<span id="cb8-8">rmse_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions_varcoeff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb8-9"></span>
<span id="cb8-10"></span>
<span id="cb8-11">gb_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> GradientBoostingRegressor(n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, learning_rate <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>)</span>
<span id="cb8-12">gb_model.fit(X_train_scaled, y_train)</span>
<span id="cb8-13"></span>
<span id="cb8-14">predictions_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gb_model.predict(X_test_scaled)</span>
<span id="cb8-15"></span>
<span id="cb8-16">rmse_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions_gb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span></code></pre></div>
</div>
<div id="8e059264" class="cell" data-execution_count="18">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb9-2">torch.manual_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb9-3"></span>
<span id="cb9-4">model_net <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientNeuralNetwork(input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>X_train_reg.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], varying_input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>X_train_coeff.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], hidden_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb9-5">criterion <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.MSELoss()</span>
<span id="cb9-6">optimizer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> optim.Adam(model_net.parameters(), lr<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.001</span>)</span>
<span id="cb9-7"></span>
<span id="cb9-8">model_net <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VaryingCoefficientNeuralNetwork(input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>X_train_reg.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, varying_input_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>X_train_coeff.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], hidden_dim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb9-9">criterion <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nn.MSELoss()</span>
<span id="cb9-10">optimizer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> optim.Adam(model_net.parameters(), lr<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.001</span>)</span>
<span id="cb9-11"></span>
<span id="cb9-12">X_train_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_train_reg).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb9-13">X_train_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.cat([torch.ones(X_train_reg_tensor.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), X_train_reg_tensor], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb9-14"></span>
<span id="cb9-15">X_test_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_test_reg).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb9-16">X_test_reg_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.cat([torch.ones(X_test_reg_tensor.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), X_test_reg_tensor], <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb9-17"></span>
<span id="cb9-18">X_train_coeff_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_train_coeff).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb9-19">X_test_coeff_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(X_test_coeff).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb9-20"></span>
<span id="cb9-21">y_train_tensor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(y_train).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()</span>
<span id="cb9-22"></span>
<span id="cb9-23">num_epochs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5000</span></span>
<span id="cb9-24"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> epoch <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(num_epochs):</span>
<span id="cb9-25">    optimizer.zero_grad()</span>
<span id="cb9-26">    outputs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_net(X_train_reg_tensor, X_train_coeff_tensor)</span>
<span id="cb9-27">    loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> criterion(outputs, y_train_tensor)</span>
<span id="cb9-28">    loss.backward()</span>
<span id="cb9-29">    optimizer.step()</span>
<span id="cb9-30"></span>
<span id="cb9-31">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_net(X_test_reg_tensor, X_test_coeff_tensor)</span>
<span id="cb9-32"></span>
<span id="cb9-33">rmse_var_coeff_net <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((predictions.detach().numpy() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> y_test)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-34"></span>
<span id="cb9-35"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for Varying Coefficient Boosting: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_varcoeff<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb9-36"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for Gradient Boosting: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_gb<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb9-37"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"RMSE for varying coefficient Neural Network: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>rmse_var_coeff_net<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>RMSE for Varying Coefficient Boosting: 0.4491864065123828
RMSE for Gradient Boosting: 0.4917861604236747
RMSE for varying coefficient Neural Network: 0.4621447551672892</code></pre>
</div>
</div>
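<p>One subtlety worth flagging when computing errors the way the cells above do: if a model returns predictions of shape <code>(N, 1)</code> while the target is a flat vector of shape <code>(N,)</code>, the elementwise difference silently broadcasts to an <code>(N, N)</code> matrix, and the resulting "RMSE" is meaningless. A minimal NumPy illustration of the pitfall:</p>

```python
import numpy as np

preds = np.zeros((5, 1))   # column vector, e.g. a network's raw output
target = np.ones(5)        # flat target vector

diff = preds - target      # broadcasts to a 5x5 matrix, not 5 residuals
print(diff.shape)          # (5, 5) -- the mean of this is not the MSE you wanted

# squeezing the predictions restores the intended elementwise difference
print((preds.squeeze() - target).shape)  # (5,)
```

<p>Flattening one side (e.g. <code>predictions.detach().numpy().squeeze()</code>, or <code>.values</code> on a pandas Series) before subtracting is a cheap way to rule this out.</p>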
<p>This time, the Varying Coefficient Boosting model even outperforms the standard Gradient Boosting model. Let us now compare how the Varying Coefficient Boosting model and the Varying Coefficient Neural Network model estimate the coefficient for the <code>temp</code> feature:</p>
<div id="44b78170" class="cell" data-execution_count="24">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1">result_dict <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {}</span>
<span id="cb11-2"></span>
<span id="cb11-3">index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb11-4">datalist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb11-5"></span>
<span id="cb11-6"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> hour <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">24</span>):</span>
<span id="cb11-7">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> month <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>):</span>
<span id="cb11-8">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> weekday <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>):</span>
<span id="cb11-9">            datalist.append([</span>
<span id="cb11-10">                np.sin(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> hour <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">24</span>),</span>
<span id="cb11-11">                np.cos(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> hour <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">24</span>),</span>
<span id="cb11-12">                np.sin(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> month <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>),</span>
<span id="cb11-13">                np.cos(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> month <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>),</span>
<span id="cb11-14">                np.sin(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> weekday <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>),</span>
<span id="cb11-15">                np.cos(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.pi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> weekday <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>),                </span>
<span id="cb11-16">            ])</span>
<span id="cb11-17"></span>
<span id="cb11-18">            index.append(<span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>hour<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">,</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>month<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">,</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>weekday<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>)</span>
<span id="cb11-19"></span>
<span id="cb11-20"></span>
<span id="cb11-21">predictions_vcb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model._predict_coeffs(np.array(datalist))</span>
<span id="cb11-22">predictions_vcnn <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_net.predict_coefficients(torch.tensor(np.array(datalist)).<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>()).detach().numpy()</span>
<span id="cb11-23"></span>
<span id="cb11-24"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
len</span>(predictions)):</span>">
font-style: inherit;">len</span>(predictions_vcb)):</span>
<span id="cb11-25">    result_dict[index[i]] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"vcb"</span>: <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>predictions_vcb[i,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.4f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> + ... + </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>predictions_vcb[i,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.4f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> * temp + ..."</span>,</span>
<span id="cb11-26">                             <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"vcnn"</span>: <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>predictions_vcnn[i,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.4f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> + ... + </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>predictions_vcnn[i,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>]<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:.4f}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;"> * temp + ..."</span>}</span>
<span id="cb11-27"></span>
<span id="cb11-28"></span>
<span id="cb11-29"></span>
<span id="cb11-30"></span>
<span id="cb11-31">result_dict_json <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> json.dumps(result_dict)</span>
<span id="cb11-32"></span>
<span id="cb11-33">html_template <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">f"""</span></span>
<span id="cb11-34"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;!DOCTYPE html&gt;</span></span>
<span id="cb11-35"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;html&gt;</span></span>
<span id="cb11-36"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;head&gt;</span></span>
<span id="cb11-37"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;title&gt;Model Selector&lt;/title&gt;</span></span>
<span id="cb11-38"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;/head&gt;</span></span>
<span id="cb11-39"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;body&gt;</span></span>
<span id="cb11-40"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;div style="text-align:center;"&gt;</span></span>
<span id="cb11-41"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;label for="hour"&gt;Hour:&lt;/label&gt;</span></span>
<span id="cb11-42"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;input type="range" id="hour" name="hour" min="1" max="24" value="12" oninput="updateModel()"&gt;</span></span>
<span id="cb11-43"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;span id="hourValue"&gt;12&lt;/span&gt;&lt;br&gt;</span></span>
<span id="cb11-44"></span>
<span id="cb11-45"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;label for="month"&gt;Month:&lt;/label&gt;</span></span>
<span id="cb11-46"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;input type="range" id="month" name="month" min="1" max="12" value="8" oninput="updateModel()"&gt;</span></span>
<span id="cb11-47"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;span id="monthValue"&gt;8&lt;/span&gt;&lt;br&gt;</span></span>
<span id="cb11-48"></span>
<span id="cb11-49"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;label for="weekday"&gt;Weekday:&lt;/label&gt;</span></span>
<span id="cb11-50"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;input type="range" id="weekday" name="weekday" min="1" max="7" value="3" oninput="updateModel()"&gt;</span></span>
<span id="cb11-51"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;span id="weekdayValue"&gt;3&lt;/span&gt;&lt;br&gt;</span></span>
<span id="cb11-52"></span>
<span id="cb11-53"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;p&gt;Varying Coefficient Boosting: &lt;span id="vcbOutput"&gt;&lt;/span&gt;&lt;/p&gt;</span></span>
<span id="cb11-54"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;p&gt;Varying Coefficient Neural Network: &lt;span id="vcnnOutput"&gt;&lt;/span&gt;&lt;/p&gt;</span></span>
<span id="cb11-55"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;/div&gt;</span></span>
<span id="cb11-56"></span>
<span id="cb11-57"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;script&gt;</span></span>
<span id="cb11-58"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">        var modelData = </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{</span>result_dict_json<span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">}</span><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">;</span></span>
<span id="cb11-59"></span>
<span id="cb11-60"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">        function updateModel() </span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">{{</span></span>
<span id="cb11-61"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            var hour = document.getElementById('hour').value;</span></span>
<span id="cb11-62"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            var month = document.getElementById('month').value;</span></span>
<span id="cb11-63"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            var weekday = document.getElementById('weekday').value;</span></span>
<span id="cb11-64"></span>
<span id="cb11-65"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            document.getElementById('hourValue').textContent = hour;</span></span>
<span id="cb11-66"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            document.getElementById('monthValue').textContent = month;</span></span>
<span id="cb11-67"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            document.getElementById('weekdayValue').textContent = weekday;</span></span>
<span id="cb11-68"></span>
<span id="cb11-69"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            var vcbModelString = modelData[hour + ',' + month + ',' + weekday]['vcb'];</span></span>
<span id="cb11-70"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            document.getElementById('vcbOutput').textContent = vcbModelString || 'No model available';</span></span>
<span id="cb11-71"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            var vcnnModelString = modelData[hour + ',' + month + ',' + weekday]['vcnn'];</span></span>
<span id="cb11-72"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">            document.getElementById('vcnnOutput').textContent = vcnnModelString || 'No model available';</span></span>
<span id="cb11-73"></span>
<span id="cb11-74"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">        </span><span class="ch" style="color: #20794D;
background-color: null;
font-style: inherit;">}}</span></span>
<span id="cb11-75"></span>
<span id="cb11-76"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">        // Initial update</span></span>
<span id="cb11-77"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">        updateModel();</span></span>
<span id="cb11-78"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">    &lt;/script&gt;</span></span>
<span id="cb11-79"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;/body&gt;</span></span>
<span id="cb11-80"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">&lt;/html&gt;</span></span>
<span id="cb11-81"><span class="ss" style="color: #20794D;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb11-82"></span>
<span id="cb11-83"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">with</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">open</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'interactive_model_selector.html'</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'w'</span>) <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> f:</span>
<span id="cb11-84">    f.write(html_template)</span></code></pre></div>
</div>
<p>We find that the Varying Coefficient Network predicts a negative <code>temp</code> coefficient at <code>hour=12, month=8, weekday=3</code>. In contrast, the Varying Coefficient Boosting model mostly predicts a positive coefficient. This might be a good occasion to ask a domain expert for their opinion. We could then fine-tune the varying coefficients by, for example, constraining them to be either all-negative or all-positive.</p>
</section>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p>[1] Hastie, Trevor; Tibshirani, Robert. Varying-coefficient models. Journal of the Royal Statistical Society Series B: Statistical Methodology, 1993</p>
<p>[2] Yue, Mu; Li, Jialiang; Cheng, Ming-Yen. Two-step sparse boosting for high-dimensional longitudinal data with varying coefficients. Computational Statistics &amp; Data Analysis, 2019</p>
<p>[3] Zhou, Yichen; Hooker, Giles. Decision tree boosted varying coefficient models. Data Mining and Knowledge Discovery, 2022</p>


</section>

 ]]></description>
  <category>Decision Trees</category>
  <category>Gradient Boosting</category>
  <guid>https://www.sarem-seitz.com/posts/varying-coefficient-boosting-for-geospatial-and-temporal-data.html</guid>
  <pubDate>Wed, 10 May 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Winning with Simple, not even Linear Time-Series Models</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Disclaimer: Title heavily inspired by <a href="https://www.youtube.com/watch?v=68ABAU_V8qI&amp;pp=ygUad2lubmluZyB3aXRoIHNpbXBsZSBtb2RlbHM%3D&amp;ref=sarem-seitz.com">this</a> great talk.</p>
<p>As the name implies, today we want to consider almost trivially simple models. Although the current trend points towards complex models, even for time-series, I am still a big believer in simplicity. In particular, when your dataset is small, the following ideas might be useful.</p>
<p>To be fair, this article will probably be most valuable for people who are just starting out with time-series analysis. Anyone else should check the table of contents first and decide for themselves if they want to continue.</p>
<p>Personally, I am still quite intrigued by how far you can push even the most simplistic time-series models. The upcoming paragraphs show some ideas and thoughts that I have been gathering on the topic over time.</p>
</section>
<section id="models-with-pure-i.i.d.-noise" class="level2">
<h2 class="anchored" data-anchor-id="models-with-pure-i.i.d.-noise">Models with pure i.i.d. noise</h2>
<p>We start with the simplest (probabilistic) way to model a (univariate) time-series. Namely, we want to look at plain <em>i</em>ndependently, <em>i</em>dentically, <em>d</em>istributed randomness: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t=%5Cepsilon_t%20%5C%5C%0A%5Cepsilon_t%20%5Csim%5Ctext%7B%20some%20distribution%7D%0A%5Cend%7Bgathered%7D%0A"> This implies that all our observations follow the same distribution at any point in time (<strong>identically</strong> distributed). Even more importantly, we presume no interrelation between observations at all (<strong>independently</strong> distributed). Obviously, this precludes any autoregressive terms as well.</p>
<p>Probably your first question is whether such models aren’t too simplistic to be useful for real-world problems. Certainly, most time-series are unlikely to have no statistical relationship with their own past.</p>
<p>While those concerns are valid, we can nevertheless deduce the following:</p>
<blockquote class="blockquote">
<p>Any time-series model that is more complex than a pure-noise model should also produce better forecasts than a pure-noise model.</p>
</blockquote>
<p>In short, we can at least use random noise as a benchmark model. There is arguably no simpler approach to create baseline benchmarks than this one. Even smoothing techniques will likely require more parameters to be fitted.</p>
<p>Besides this rather obvious use-case, there is another potential application for i.i.d. noise. Due to their simplicity, noise models can be useful for very small datasets. Consider this: if big, complex models require large datasets to prevent overfitting, then simple models should get by with only a handful of data points.</p>
<p>Of course, it is debatable what dataset size can be seen as ‘small’.</p>
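<p>As a minimal sketch of such a benchmark (the function name and defaults below are my own, not a library API), fitting a pure-noise model amounts to estimating a single distribution from the training data and reusing it as the predictive distribution for every future step:</p>

```python
import numpy as np

def iid_noise_forecast(y_train, horizon, n_samples=1000, seed=0):
    # Fit the simplest probabilistic model: one normal distribution with
    # the sample mean and standard deviation of the training data.
    rng = np.random.default_rng(seed)
    mu, sigma = y_train.mean(), y_train.std(ddof=1)
    # Independence means every future step gets the exact same distribution.
    return rng.normal(mu, sigma, size=(n_samples, horizon))

y_train = np.array([1.2, 0.8, 1.1, 0.9, 1.0])
samples = iid_noise_forecast(y_train, horizon=3)
point_forecast = samples.mean(axis=0)  # roughly the training mean at every step
```

<p>Any model worth its extra complexity should beat this forecast out-of-sample.</p>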
</section>
<section id="integrated-i.i.d.-noise" class="level2">
<h2 class="anchored" data-anchor-id="integrated-i.i.d.-noise">Integrated i.i.d. noise</h2>
<p>Now, things are becoming more interesting. While raw i.i.d. noise cannot account for auto-correlation between observations, integrated noise can. Before we do a demonstration, let us introduce the differencing operator: <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta_s%20y_t=y_t-y_%7Bt-s%7D%0A"> If you haven’t heard about differencing for time-series problems yet - great! If you have, then you can hopefully still learn something new.</p>
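<p>In code, the differencing operator is just an array shifted against itself. A small NumPy sketch (the helper function is mine, not from any library):</p>

```python
import numpy as np

def seasonal_diff(y, s=1):
    # Delta_s y_t = y_t - y_{t-s}; the first s observations drop out
    # because they have no predecessor s steps back.
    return y[s:] - y[:-s]

y = np.array([10.0, 12.0, 11.0, 13.0, 14.0])
first_diff = seasonal_diff(y)        # [2., -1., 2., 1.]
season_diff = seasonal_diff(y, s=2)  # [1., 1., 3.]
```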
<section id="definition-of-an-integrated-time-series" class="level3">
<h3 class="anchored" data-anchor-id="definition-of-an-integrated-time-series">Definition of an integrated time-series</h3>
<p>With the difference operator in our toolbox, we can now define an integrated time-series</p>
<blockquote class="blockquote">
<p><em>A time-series <img src="https://latex.codecogs.com/png.latex?y_t"> is said to be integrated of order <img src="https://latex.codecogs.com/png.latex?p"> with seasonality <img src="https://latex.codecogs.com/png.latex?s"> if <img src="https://latex.codecogs.com/png.latex?%5CDelta_s%5Ep%20y_t"> is a stationary time-series.</em></p>
</blockquote>
<p>There are several ideas in this definition that we should clarify further:</p>
<p>First, you probably noticed the concept of exponentiating the difference operator. You can simply think of this as applying the difference operator several times. For the squared difference operator, this would look as follows: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5CDelta_1%5E2%20y_t=%5CDelta_1%5Cleft(%5CDelta_1%20y_t%5Cright)=%5CDelta_1%5Cleft(y_t-y_%7Bt-1%7D%5Cright)%20%5C%5C%0A=%5Cleft(y_t-y_%7Bt-1%7D%5Cright)-%5Cleft(y_%7Bt-1%7D-y_%7Bt-2%7D%5Cright)%20%5C%5C%0A%5CDelta_1%20%5CDelta_1%20y_t%20.%0A%5Cend%7Bgathered%7D%0A"> As we will see, multiple difference operators allow us to handle different time-series patterns at once.</p>
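<p>To verify this repetition numerically, note that <code>np.diff</code> exposes exactly this behavior through its <code>n</code> argument:</p>

```python
import numpy as np

y = np.array([1.0, 4.0, 9.0, 16.0, 25.0])  # quadratic in t
twice = np.diff(np.diff(y))   # apply the s=1 difference operator two times
direct = np.diff(y, n=2)      # same operation in a single call
# A quadratic series becomes constant after differencing twice.
```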
<p>Second, it is common convention to simply write <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%20y_t%20%5Ctext%20%7B%20if%20%7D%20s=1%20%5Ctext%20%7B.%20%7D%0A"> We will happily adopt this convention here. Also, we call such time-series simply integrated without referencing their order or seasonality.</p>
<p>Obviously, we also need to re-transform a difference representation back to its original domain. In our notation, this means we invert the difference transformation, i.e. <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%5E%7B-1%7D%20%5CDelta%20y_t=y_t%0A"> must hold for arbitrary difference transformations. If we expand this formula, we get <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5CDelta%5E%7B-1%7D%20%5CDelta%20y_t=y_t%20%5C%5C%0A%5CLeftrightarrow%20%5CDelta%5E%7B-1%7D%20%5CDelta%20y_t=y_t%20%5C%5C%0A%5CLeftrightarrow%20%5CDelta%5E%7B-1%7D%5Cleft(y_t-y_%7Bt-1%7D%5Cright)=y_t%20%5C%5C%0A%5CLeftrightarrow%5Cleft(y_t-y_%7Bt-1%7D%5Cright)=%5CDelta%20y_t%20%5C%5C%0A%5CLeftrightarrow%20y_t=y_%7Bt-1%7D+%5CDelta%20y_t%0A%5Cend%7Bgathered%7D%0A"> These simplifications follow from the fact that the difference operator is linear (we won’t cover the details here). Technically, the last equation merely says that each observation is the previous observation plus a delta.</p>
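<p>In NumPy terms, the inverse difference operator is a cumulative sum anchored at the first observation. A quick sanity check of this identity:</p>

```python
import numpy as np

y = np.array([3.0, 5.0, 4.0, 6.0, 8.0])
dy = np.diff(y)  # Delta y_t = y_t - y_{t-1}
# Invert: y_t = y_{t-1} + Delta y_t, unrolled into a cumulative
# sum of the deltas anchored at the first observation y_0.
y_rec = np.concatenate(([y[0]], y[0] + np.cumsum(dy)))
```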
<p>In a forecasting problem, we will typically have a prediction for the change <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%20y_t.%0A"> Let’s denote this prediction as <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%20%5Chat%7By%7D_t%0A"> to stress that it is not the actual change, but a predicted one. Thus, the forecast for the integrated time-series is <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7By%7D_t=y_%7Bt-1%7D+%5CDelta%20%5Chat%7By%7D_t.%0A"> Afterwards, we apply this logic recursively as far into the future as our forecast should go: <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7By%7D_%7Bt+h%7D=%5Chat%7By%7D_%7Bt+h-1%7D+%5CDelta%20%5Chat%7By%7D_%7Bt+h%7D%0A"> <img src="https://www.sarem-seitz.com/images/winning-with-simple-not-even-linear-time-series-models/nonintegrated_vs_integrated.png" class="img-fluid" alt="White noise time-series (left) and corresponding integrated time-series (right)."></p>
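<p>The forecasting recursion likewise collapses into a cumulative sum over the predicted deltas, started at the last observed value. A sketch with made-up deltas:</p>

```python
import numpy as np

def integrate_forecast(last_obs, delta_hat):
    # y_hat_{t+h} = y_hat_{t+h-1} + delta_hat_{t+h}; unrolling the
    # recursion gives last_obs plus the running sum of the deltas.
    return last_obs + np.cumsum(delta_hat)

forecast = integrate_forecast(10.0, np.array([0.5, -0.2, 0.3]))  # ~[10.5, 10.3, 10.6]
```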
</section>
</section>
<section id="integrated-noise-for-seemingly-complex-patterns" class="level2">
<h2 class="anchored" data-anchor-id="integrated-noise-for-seemingly-complex-patterns">Integrated noise for seemingly complex patterns</h2>
<p>By now, you can probably imagine what is meant by an integrated noise model. In fact, we can come up with countless variants of an integrated noise model by just chaining some difference operators with random noise.</p>
<section id="linear-trends-from-integrated-time-series" class="level3">
<h3 class="anchored" data-anchor-id="linear-trends-from-integrated-time-series">Linear trends from integrated time-series</h3>
<p>One possibility would be a simply integrated time-series, i.e.</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5CDelta%20y_t=%5Cepsilon_t%5C%5C%0A%5Cepsilon_t%20%5Csim%5Ctext%7B%20some%20distribution%7D%0A%5Cend%7Bgathered%7D%0A"> It is an interesting exercise to simulate data from such a model using a plain standard normal distribution.</p>
<p>As it turns out, samples from this time-series appear to exhibit linear trends with potential change points. However, it is clear that these trends and change points occur completely at random.</p>
<p>This implies that simply fitting piece-wise linear functions to forecast such trends can be a dangerous approach. After all, if the changes are occurring at random, then all linear trend lines are mere artifacts of the random data-generating process.</p>
<p>As an important disclaimer, though, ‘unpredictable’ means unpredictable from the time-series itself. An external feature might still be able to accurately forecast potential change points. Here, however, we presume that the time-series is our only available source of information.</p>
<p>Below, you can see an example of the described phenomenon. While there appears to be a trend change at around t=50, this change is purely random. The upward trend after t=50 also stalls at around t=60. Imagine how your model would have performed if it had extrapolated this upward trend beyond t=60.</p>
<div id="cell-6" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"></span>
<span id="cb1-4">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">321</span>)</span>
<span id="cb1-5"></span>
<span id="cb1-6">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb1-7">plt.plot(np.cumsum(np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb1-8">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb1-9">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Of course, the saying goes ‘never say never’, even in those settings. However, you should really know what you are doing if you apply such models.</p>
</section>
<section id="seasonal-patterns" class="level3">
<h3 class="anchored" data-anchor-id="seasonal-patterns">Seasonal patterns</h3>
<p>Similar to how simple integration produces trends, we can also create seasonal patterns:</p>
<p>Formally, we now need the s-th difference of our seasonal process to be a stationary process, e.g. <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5CDelta_s%20y_t=%5Cepsilon_t%5C%5C%0A%5Cepsilon_t%20%5Csim%5Ctext%7B%20some%20distribution%7D%0A%5Cend%7Bgathered%7D%0A"> The inverse operation - transforming the i.i.d. process back to the seasonally integrated one - works similarly to the one before: <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7By%7D_t=y_%7Bt-s%7D+%5CDelta_s%20%5Chat%7By%7D_t%0A"> You can think of the inverse operation of seasonal differencing as a cumsum operation over s periods. Since I am not aware of a corresponding native NumPy function, I use <code>reshape-&gt;cumsum-&gt;reshape</code> to get the desired outcome. Below is an example with <img src="https://latex.codecogs.com/png.latex?s=4">:</p>
<div id="cell-8" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb2-3"></span>
<span id="cb2-4">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">321</span>)</span>
<span id="cb2-5"></span>
<span id="cb2-6">white_noise <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>))</span>
<span id="cb2-7">seasonal_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cumsum(white_noise.reshape((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">25</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>)),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-8"></span>
<span id="cb2-9">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb2-10">plt.plot(seasonal_series,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb2-11">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-12">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>As you can see, the generated time-series looks reasonably realistic. We could easily sell this as quarterly sales numbers of some product to an unsuspecting Data Scientist.</p>
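<p>For forecasting rather than simulation, the seasonal inverse has to be applied recursively: once the horizon exceeds one season, earlier forecasts take over the role of the lagged values. A hand-rolled sketch (the function below is my own illustration, not a library routine):</p>

```python
import numpy as np

def seasonal_integrate_forecast(history, delta_hat, s=4):
    # y_hat_t = y_{t-s} + Delta_s y_hat_t; more than s steps ahead,
    # the lag y_{t-s} is itself a previously computed forecast.
    y = list(history)
    for d in delta_hat:
        y.append(y[-s] + d)
    return np.array(y[len(history):])

history = np.array([1.0, 2.0, 3.0, 4.0, 1.5, 2.5, 3.5, 4.5])
seasonal_integrate_forecast(history, np.full(5, 0.5), s=4)
# first four forecasts reuse observed lags; the fifth reuses the first forecast
```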
<p>We could even combine both types of integration to generate a seasonal time-series with trending behavior:</p>
<div id="cell-10" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb3-2"></span>
<span id="cb3-3">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb3-4"></span>
<span id="cb3-5">white_noise <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">240</span>))</span>
<span id="cb3-6">integrated_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cumsum(np.cumsum(white_noise.reshape((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb3-7"></span>
<span id="cb3-8">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb3-9">plt.plot(integrated_series,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb3-10">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb3-11">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>At this point, you will probably realize that the title of this article was a little click-baity. Integrated time-series are, in fact, purely linear models. However, I believe that most people wouldn’t consider a model with, more-or-less, zero parameters a typical linear model.</p>
</section>
<section id="memory-effects-through-integration" class="level3">
<h3 class="anchored" data-anchor-id="memory-effects-through-integration">Memory effects through integration</h3>
<p>Another interesting property of integrated time-series is the ability to model memory effects.</p>
<p>This effect can be seen particularly well when there are larger shocks or outliers in our data. Consider the below example, which shows seasonal integration of order <img src="https://latex.codecogs.com/png.latex?s=12"> over i.i.d. draws from a standard <a href="https://en.wikipedia.org/wiki/Cauchy_distribution?ref=sarem-seitz.com">Cauchy distribution</a>:</p>
<div id="cell-12" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb4-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb4-3"></span>
<span id="cb4-4">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb4-5"></span>
<span id="cb4-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#cauchy distribution is equivalent to a Student-T with 1 degree of freedom</span></span>
<span id="cb4-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#see https://stats.stackexchange.com/questions/151854/a-normal-divided-by-the-sqrt-chi2s-s-gives-you-a-t-distribution-proof</span></span>
<span id="cb4-8">heavy_tailed_noise <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">120</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>np.sqrt(np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">120</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-9">seasonal_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cumsum(heavy_tailed_noise.reshape((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-10"></span>
<span id="cb4-11">fig, axs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">24</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>), nrows<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, ncols<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-12"></span>
<span id="cb4-13"></span>
<span id="cb4-14">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].plot(heavy_tailed_noise,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb4-15">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb4-16">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-17"></span>
<span id="cb4-18">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].plot(seasonal_series,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb4-19">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb4-20">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-5-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The first large shock in the i.i.d. Cauchy series, at around t=20, is sustained over the whole integrated series on the right. Over time, further shocks occur and are likewise sustained.</p>
<p>This memory property can be very useful in practice. For example, the economic shocks from the pandemic have caused persistent changes in many time-series.</p>
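<p>A minimal sketch of this persistence on hypothetical toy data (not the Cauchy series above): a single shock in an otherwise zero noise sequence stays in the integrated series forever.</p>

```python
import numpy as np

# Toy illustration: one large shock in a noise sequence persists
# indefinitely once the series is integrated (cumulatively summed).
noise = np.zeros(100)
noise[20] = 5.0                 # a single shock at t = 20
integrated = np.cumsum(noise)

print(integrated[19])           # 0.0 - no effect before the shock
print(integrated[-1])           # 5.0 - the shock is fully sustained
```

<p>Every value after t=20 carries the full shock, which is exactly the memory property of an integrated process.</p>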
</section>
</section>
<section id="benchmarking-against-nbeats-and-nhits" class="level2">
<h2 class="anchored" data-anchor-id="benchmarking-against-nbeats-and-nhits">Benchmarking against NBEATS and NHITS</h2>
<p>Let us now use the AirPassengers dataset from Nixtla’s <a href="https://github.com/Nixtla/neuralforecast?ref=sarem-seitz.com">neuralforecast</a> for a quick evaluation of the ideas above. If you read my articles regularly, you might remember the general procedure from <a href="https://www.sarem-seitz.com/facebook-prophet-covid-and-why-i-dont-trust-the-prophet/#:~:text=even%20simpler%20forecast-,model,-As%20you%20might">this one</a>.</p>
<p>First, we split the data into a train and test period, with the latter consisting of 36 months of data:</p>
<div id="cell-15" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb5-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> neuralforecast.utils <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> AirPassengersDF</span>
<span id="cb5-3"></span>
<span id="cb5-4">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> AirPassengersDF.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]</span>
<span id="cb5-5">df.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"date"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sales"</span>]</span>
<span id="cb5-6">df.index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.to_datetime(df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"date"</span>])</span>
<span id="cb5-7"></span>
<span id="cb5-8">sales <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sales"</span>]</span>
<span id="cb5-9"></span>
<span id="cb5-10">train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sales.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">36</span>]</span>
<span id="cb5-11">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sales.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">36</span>:]</span>
<span id="cb5-12"></span>
<span id="cb5-13"></span>
<span id="cb5-14">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb5-15">plt.plot(train,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train"</span>)</span>
<span id="cb5-16">plt.plot(test,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test"</span>)</span>
<span id="cb5-17">plt.legend()</span>
<span id="cb5-18">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb5-19">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>In order to obtain a stationary, i.i.d. series, we apply the following transformation: <img src="https://latex.codecogs.com/png.latex?%0A%5Cepsilon_t=%5CDelta%20%5CDelta_%7B12%7D%20%5Csqrt%7By_t%7D%0A"> First, the square root stabilizes the increasing variance. The two differencing operators then remove trend and seasonality. For the corresponding re-transformation, see the code further below.</p>
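<p>To make the transformation and its inverse concrete, here is a round-trip sketch on synthetic placeholder data (the variable names are stand-ins, not the post’s AirPassengers objects):</p>

```python
import numpy as np
import pandas as pd

# Synthetic positive series as a stand-in for the raw data
rng = np.random.default_rng(0)
y = pd.Series(np.abs(np.cumsum(rng.normal(size=60))) + 10.0)

rooted = np.sqrt(y)                 # stabilize variance
diffed = rooted.diff(1)             # remove trend
eps = diffed.diff(12).dropna()      # remove seasonality

# Inverse: undo the seasonal difference, then the first difference, then the root
rec_diffed = eps + diffed.shift(12).loc[eps.index]
rec_rooted = rec_diffed + rooted.shift(1).loc[rec_diffed.index]
rec_y = rec_rooted ** 2

print(np.allclose(rec_y.values, y.loc[rec_y.index].values))  # True
```

<p>The inverse simply reverses each step in the opposite order, which is what the forecasting code below exploits when re-integrating sampled residuals.</p>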
<div id="cell-17" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1">rooted <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(train)</span>
<span id="cb6-2">diffed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> rooted.diff(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb6-3">diffed_s <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> diffed.diff(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>).dropna()</span>
<span id="cb6-4"></span>
<span id="cb6-5">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb6-6">plt.plot(diffed_s,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train stationary"</span>)</span>
<span id="cb6-7">plt.legend()</span>
<span id="cb6-8"></span>
<span id="cb6-9">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-10">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-7-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>We can also check a histogram and density plot of the stabilized time-series:</p>
<div id="cell-19" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> gaussian_kde</span>
<span id="cb7-2"></span>
<span id="cb7-3">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb7-4">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb7-5">plt.hist(diffed_s,bins<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>,density <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Histogram of diffed time-series"</span>)</span>
<span id="cb7-6"></span>
<span id="cb7-7">kde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gaussian_kde(diffed_s)</span>
<span id="cb7-8"></span>
<span id="cb7-9">target_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(diffed_s)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(diffed_s)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,num<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb7-10"></span>
<span id="cb7-11">plt.plot(target_range, kde.pdf(target_range),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gaussian Kernel Density of diffed time-series"</span>)</span>
<span id="cb7-12"></span>
<span id="cb7-13">plt.legend()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-8-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Our stationary series also looks roughly normally distributed, which is always a nice property.</p>
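<p>One could also quantify that visual impression with a quick normality test; a sketch on stand-in data (with the real series, you would pass <code>diffed_s</code> instead):</p>

```python
import numpy as np
from scipy import stats

# Stand-in sample for the differenced series
rng = np.random.default_rng(42)
sample = rng.normal(size=131)

# D'Agostino-Pearson test: H0 = the sample comes from a normal distribution
stat, p_value = stats.normaltest(sample)
print(p_value)   # a large p-value means we cannot reject normality
```
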
<p>Now, let us create the forecast for the test period. Assuming that we don’t know the exact distribution of our i.i.d. series, we simply draw from the empirical distribution of the training residuals. Hence, we simulate future values by re-integrating random samples from the empirical data:</p>
<div id="cell-21" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">full_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [] </span>
<span id="cb8-2"></span>
<span id="cb8-3">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb8-4"></span>
<span id="cb8-5"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>):</span>
<span id="cb8-6">    draw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.choice(diffed_s,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))</span>
<span id="cb8-7">    result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(diffed.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:].values)</span>
<span id="cb8-8"></span>
<span id="cb8-9">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test)):</span>
<span id="cb8-10">        result.append(result[t]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>draw[t])</span>
<span id="cb8-11"></span>
<span id="cb8-12">    full_sample.append(np.array(((rooted.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(result[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:]))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb8-13"></span>
<span id="cb8-14">    </span>
<span id="cb8-15">reshaped <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate(full_sample,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-16">result_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(reshaped,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-17">lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(reshaped,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-18">upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(reshaped,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-19"></span>
<span id="cb8-20"></span>
<span id="cb8-21">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">14</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb8-22">plt.plot(train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb8-23">plt.plot(test, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span>
<span id="cb8-24">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb8-25"></span>
<span id="cb8-26">plt.plot(test.index, result_mean,label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Simple model forecast"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>)</span>
<span id="cb8-27">plt.legend()</span>
<span id="cb8-28">plt.fill_between(test.index,lower,upper,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models_files/figure-html/cell-9-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This looks very good: the mean forecast tracks the test data closely. In addition, the simulation gives us an empirical sample of the entire forecast distribution, so prediction intervals come essentially for free.</p>
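<p>As an aside, the simulation loop above can be vectorized across all 10,000 paths; a sketch with stand-in inputs (the arrays here are placeholders for <code>diffed_s</code>, the last twelve values of <code>diffed</code>, and <code>rooted.iloc[-1]</code>):</p>

```python
import numpy as np

rng = np.random.default_rng(123)

# Stand-in inputs (placeholders for the post's variables):
eps = rng.standard_normal(119) * 0.2   # stationary training residuals
tail = rng.standard_normal(12) * 0.2   # last 12 once-differenced values
last_root = 12.0                       # last observed square-rooted value
h, n_sims = 36, 10_000

draws = rng.choice(eps, size=(n_sims, h))            # bootstrap innovations
paths = np.concatenate([np.tile(tail, (n_sims, 1)),
                        np.empty((n_sims, h))], axis=1)
for t in range(h):                                   # undo the seasonal difference
    paths[:, 12 + t] = paths[:, t] + draws[:, t]
# Undo the first difference (cumsum) and the square root (square)
forecasts = (last_root + np.cumsum(paths[:, 12:], axis=1)) ** 2

mean_fc = forecasts.mean(axis=0)
lower, upper = np.quantile(forecasts, [0.05, 0.95], axis=0)
```

<p>Only the short loop over the 36 forecast steps remains, rather than one Python loop per simulated path.</p>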
<p>Finally, let us see how our approach compares against considerably more complex time-series models. To do so, I went with Nixtla’s implementation of <a href="https://arxiv.org/abs/1905.10437?ref=sarem-seitz.com">NBEATS</a> and <a href="https://arxiv.org/abs/2201.12886?ref=sarem-seitz.com">NHITS</a>:</p>
<div id="cell-23" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> copy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> deepcopy</span>
<span id="cb9-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> neuralforecast <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> NeuralForecast</span>
<span id="cb9-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> neuralforecast.models <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> NBEATS, NHITS</span>
<span id="cb9-4"></span>
<span id="cb9-5">train_nxt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(train).reset_index()</span>
<span id="cb9-6">train_nxt.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ds"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>]</span>
<span id="cb9-7">train_nxt[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"unique_id"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.ones(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(train))</span>
<span id="cb9-8"></span>
<span id="cb9-9">test_nxt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(test).reset_index()</span>
<span id="cb9-10">test_nxt.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ds"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>]</span>
<span id="cb9-11">test_nxt[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"unique_id"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.ones(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))</span>
<span id="cb9-12"></span>
<span id="cb9-13"></span>
<span id="cb9-14">horizon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test_nxt)</span>
<span id="cb9-15">models <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [NBEATS(input_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> horizon, h<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>horizon,max_epochs<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>),</span>
<span id="cb9-16">          NHITS(input_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> horizon, h<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>horizon,max_epochs<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>)]</span>
<span id="cb9-17"></span>
<span id="cb9-18">nf <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> NeuralForecast(models<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>models, freq<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'M'</span>)</span>
<span id="cb9-19">nf.fit(df<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>train_nxt)</span>
<span id="cb9-20">Y_hat_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> nf.predict().reset_index()</span>
<span id="cb9-21"></span>
<span id="cb9-22"></span>
<span id="cb9-23">nbeats <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Y_hat_df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"NBEATS"</span>]</span>
<span id="cb9-24">nhits <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Y_hat_df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"NHITS"</span>]</span>
<span id="cb9-25"></span>
<span id="cb9-26"></span>
<span id="cb9-27">rmse_simple <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((test.values<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>result_mean)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-28">rmse_nbeats <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((test.values<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>nbeats.values)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)) </span>
<span id="cb9-29">rmse_nhits <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((test.values<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>nhits.values)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-30"></span>
<span id="cb9-31">pd.DataFrame([rmse_simple,rmse_nbeats,rmse_nhits], index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Simple"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"NBEATS"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"NHITS"</span>], columns<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"RMSE"</span>])</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>Global seed set to 1</code></pre>
</div>
<div class="cell-output cell-output-display" data-execution_count="11">
<div>


<table class="dataframe caption-top table table-sm table-striped small" data-quarto-postprocess="true" data-border="1">
<thead>
<tr class="header">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">RMSE</th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td data-quarto-table-cell-role="th">Simple</td>
<td>25.502159</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">NBEATS</td>
<td>44.069832</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">NHITS</td>
<td>62.713951</td>
</tr>
</tbody>
</table>

</div>
</div>
</div>
<p>As we can see, our almost trivial model has beaten two sophisticated time-series models by a fair margin. Of course, we need to emphasize that this does not allow us to draw any general conclusions.</p>
<p>Rather, I’d expect the neural models to outperform our simple approach on larger datasets. Nevertheless, such trivial models are always a worthwhile benchmark.</p>
</section>
<section id="takeaways---what-do-we-make-of-this" class="level2">
<h2 class="anchored" data-anchor-id="takeaways---what-do-we-make-of-this">Takeaways - What do we make of this?</h2>
<p>As stated multiple times throughout this article:</p>
<blockquote class="blockquote">
<p>A seemingly complex time-series could still follow a fairly simple data-generating process.</p>
</blockquote>
<p>In the end, you might spend hours trying to fit an overly complex model even though the underlying problem is almost trivial. At some point, somebody could come along, fit a simple ARIMA(1,0,0), and still outperform your sophisticated neural model.</p>
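<p>To make this concrete, here is a minimal sketch (entirely synthetic data, not from the article) of fitting such a simple ARIMA(1,0,0), i.e.&nbsp;an AR(1), by plain least squares:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate an AR(1) process: y_t = 0.8 * y_{t-1} + eps_t
n = 500
y = np.zeros(n)
for t in range(1, n):
    y[t] = 0.8 * y[t - 1] + rng.normal()

# With zero mean, an ARIMA(1,0,0) coefficient can be fit by
# least squares of the series on its own lag
lagged, target = y[:-1], y[1:]
phi_hat = (lagged @ target) / (lagged @ lagged)

# One-step-ahead forecast from the last observation
forecast = phi_hat * y[-1]
```

<p>A two-line model like this is often a surprisingly hard baseline to beat.</p>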
<p>To avoid the above worst-case scenario, consider the following idea:</p>
<blockquote class="blockquote">
<p>When starting out with a new time-series problem, always start with the simplest possible model and use it as a benchmark for all other models.</p>
</blockquote>
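<p>As a minimal illustration of this advice (the data and helper names below are hypothetical, not taken from the article), a seasonal-naive benchmark takes only a few lines:</p>

```python
import numpy as np

def seasonal_naive_forecast(train, horizon, season_length):
    """Repeat the last observed season as the forecast."""
    last_season = train[-season_length:]
    reps = int(np.ceil(horizon / season_length))
    return np.tile(last_season, reps)[:horizon]

def rmse(actual, predicted):
    return np.sqrt(np.mean((actual - predicted) ** 2))

# Toy example: a noisy seasonal series with period 12
rng = np.random.default_rng(1)
t = np.arange(240)
series = 10 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 1, 240)

train, test = series[:-24], series[-24:]
benchmark = seasonal_naive_forecast(train, horizon=24, season_length=12)
benchmark_rmse = rmse(test, benchmark)
```

<p>Any more sophisticated model should first demonstrate that it beats this kind of benchmark out-of-sample.</p>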
<p>Although this is common knowledge in the Data Science community, I feel it deserves particular emphasis in this context. Given today’s (to some extent justified) hype around Deep Learning, it can be tempting to start directly with something fancy.</p>
<p>For many problems, this might just be the right way to go. Nobody would consider a Hidden Markov Model for NLP today, now that LLM embeddings are available almost for free.</p>
<p>Once your time-series dataset becomes large, however, modern Machine Learning will likely perform better. In particular, <a href="https://lightgbm.readthedocs.io/en/v3.3.2/?ref=sarem-seitz.com">Gradient Boosted Trees</a> are very popular for such large-scale problems.</p>
<p>A more controversial approach would be, you guessed it, Deep Learning for time-series. While some people believe that these models don’t work as well here, their popularity at <a href="https://www.amazon.science/videos-and-tutorials/forecasting-big-time-series-theory-and-practice?ref=sarem-seitz.com">tech firms like Amazon</a> probably speaks for itself.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Hamilton, James Douglas. Time Series Analysis. Princeton University Press, 2020.</p>
<p><strong>[2]</strong> Hyndman, Rob J., &amp; Athanasopoulos, George. Forecasting: Principles and Practice. OTexts, 2018.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/winning-with-simple-not-even-linear-time-series-models.html</guid>
  <pubDate>Wed, 10 May 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Varying Coefficient GARCH</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/varying-coefficient-garch.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>As you can probably tell by my other articles (for example <a href="https://www.sarem-seitz.com/random-forests-and-boosting-for-arch-like-volatility-forecasts/">here</a>, <a href="https://www.sarem-seitz.com/multivariate-garch-with-python-and-tensorflow/">here</a> and <a href="https://www.sarem-seitz.com/lets-make-garch-more-flexible-with-normalizing-flows/">here</a>), I am a big fan of GARCH models. Forecasting conditional variance is arguably the best we can do when predicting stock returns from past returns alone.</p>
<p>Still, the GARCH family is no silver bullet that suddenly makes you a stock wizard. The countless variations imply that there is no single best approach to handling conditional variance.</p>
<p>Today, let us look at one interesting variant of GARCH - namely, <strong>Varying Coefficient GARCH</strong>. If you are in a hurry, you can find the <strong>Jupyter notebook</strong> corresponding to this article <a href="https://github.com/SaremS/sample_notebooks/blob/master/Varying%20Coefficient%20GARCH.ipynb?ref=sarem-seitz.com">here</a>.</p>
</section>
<section id="drawbacks-of-traditional-garch" class="level2">
<h2 class="anchored" data-anchor-id="drawbacks-of-traditional-garch">Drawbacks of traditional GARCH</h2>
<p>First, we’ll quickly go through some limitations of the standard GARCH model. Although we have discussed them before, it’s always good to refresh important aspects of our models.</p>
<p>For simplicity, we will only go through <code>GARCH(1,1)</code>. The generalized version just uses an arbitrary number of lags for both squared observations and variance. Also, we assume a constant mean of zero.</p>
<p>With that in mind, <code>GARCH(1,1)</code> is defined by the following equations: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5Csigma_t%5E2%5Cright)%20%5C%5C%0A%5Csigma_t%5E2=%5Comega+%5Calpha%20y_%7Bt-1%7D%5E2+%5Cbeta%20%5Csigma_%7Bt-1%7D%5E2%20%5C%5C%0A%5Comega%3E0%20%5C%5C%0A0%3C%5Calpha+%5Cbeta%3C1%0A%5Cend%7Bgathered%7D%0A"> From these, we can derive two important issues:</p>
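<p>To build some intuition before discussing these issues, here is a short, hypothetical sketch (parameter values are my own) of simulating a <code>GARCH(1,1)</code> process under the constraints above:</p>

```python
import numpy as np

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.85, seed=42):
    """Simulate zero-mean GARCH(1,1): sigma2_t = omega + alpha*y_{t-1}^2 + beta*sigma2_{t-1}."""
    assert omega > 0 and 0 < alpha + beta < 1  # stationarity conditions from the text
    rng = np.random.default_rng(seed)
    y = np.zeros(n)
    sigma2 = np.zeros(n)
    sigma2[0] = omega / (1 - alpha - beta)  # start at the unconditional variance
    y[0] = rng.normal(0, np.sqrt(sigma2[0]))
    for t in range(1, n):
        sigma2[t] = omega + alpha * y[t - 1] ** 2 + beta * sigma2[t - 1]
        y[t] = rng.normal(0, np.sqrt(sigma2[t]))
    return y, sigma2

returns, variances = simulate_garch11(1000)
```

<p>Note how the parameter constraints keep every conditional variance strictly positive and the process stable over time.</p>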
<section id="the-linear-assumption-of-garch" class="level3">
<h3 class="anchored" data-anchor-id="the-linear-assumption-of-garch">The linear assumption of GARCH</h3>
<p>GARCH makes the relatively mild assumption that variance is a linear combination of past squared observations and past variances. On the one hand, this goes very well with Occam’s razor: simpler models are very often more robust.</p>
<p>One observation I often make when experimenting with more flexible GARCH models is overfitting. Consider a very bad probabilistic model for some data: if you allow the variance to be very flexible, you just need to make that variance very large. Then all of your training observations will still achieve a reasonably high likelihood and thus a seemingly good model fit.</p>
<div id="cell-4" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Distributions</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Printf</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Plots</span></span>
<span id="cb1-2"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb1-3"></span>
<span id="cb1-4">data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb1-5"></span>
<span id="cb1-6">dist1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-7">dist2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>)</span>
<span id="cb1-8"></span>
<span id="cb1-9">ll1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logpdf</span>.(dist1,data))</span>
<span id="cb1-10">ll2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logpdf</span>.(dist2,data))</span>
<span id="cb1-11"></span>
<span id="cb1-12">line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>)</span>
<span id="cb1-13"></span>
<span id="cb1-14">p1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(line, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(dist1,line), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model density"</span>, size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), </span>
<span id="cb1-15">          ylim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.02</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@sprintf</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model LogLikelihood: %.3f"</span> ll1)</span>
<span id="cb1-16"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(p1, line,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),line),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"True density"</span>)</span>
<span id="cb1-17"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter!</span>(p1, data, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sampled data"</span>)</span>
<span id="cb1-18"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vline!</span>(p1, [<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>dash, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model distribution mean"</span>)</span>
<span id="cb1-19"></span>
<span id="cb1-20">p2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(line, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(dist2,line), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model density"</span>, size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), </span>
<span id="cb1-21">          ylim<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.02</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@sprintf</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model LogLikelihood: %.3f"</span> ll2)</span>
<span id="cb1-22"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(p2, line,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),line),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"True density"</span>)</span>
<span id="cb1-23"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter!</span>(p2, data, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sampled data"</span>)</span>
<span id="cb1-24"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vline!</span>(p2, [<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>dash, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model distribution mean"</span>)</span>
<span id="cb1-25"></span>
<span id="cb1-26"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(p1,p2, size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="2">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-garch_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Thus, the linearity assumption guarantees a sensible amount of model regularization. On the other hand, this might nevertheless be too restrictive when linearity is clearly not present in the data.</p>
</section>
<section id="the-gaussian-assumption" class="level3">
<h3 class="anchored" data-anchor-id="the-gaussian-assumption">The Gaussian assumption</h3>
<p>The Gaussian assumption is probably the most common one among foundational statistical and econometric models, and standard GARCH presumes Gaussian observations as well. The only difference from standard time-series models is that we are predicting the variance rather than the mean.</p>
<p>As mentioned countless times before, real-world data is almost never Gaussian. This is particularly the case for stock market returns, where GARCH is probably used the most. Hence, a fundamental assumption of our model stands in conflict with real-world observations.</p>
<p>In practice, the Gaussian distribution is often replaced by something more flexible. Examples include the location-scale <a href="https://en.wikipedia.org/wiki/Student%27s_t-distribution?ref=sarem-seitz.com">Student-t</a> distribution or the <a href="https://en.wikipedia.org/wiki/Generalized_beta_distribution?ref=sarem-seitz.com">Generalized Beta distribution</a>.</p>
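<p>The fat tails that motivate these replacements are easy to verify numerically. A small sketch (synthetic data, with a Student-t sample as a stand-in for return-like data) comparing excess kurtosis:</p>

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200_000

normal_sample = rng.normal(size=n)
fat_tailed = rng.standard_t(df=4, size=n)  # heavier tails, a common proxy for returns

def excess_kurtosis(x):
    """Sample excess kurtosis; zero for a Gaussian."""
    z = (x - x.mean()) / x.std()
    return np.mean(z ** 4) - 3.0

kurt_normal = excess_kurtosis(normal_sample)   # close to 0
kurt_fat = excess_kurtosis(fat_tailed)         # clearly positive
```

<p>Real return series typically show a clearly positive excess kurtosis, which is exactly what the Gaussian assumption cannot capture.</p>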
<p>Today, we are considering the linearity issue of GARCH. For a possible treatment of the Gaussian issue, you can also take a look at <a href="https://www.sarem-seitz.com/lets-make-garch-more-flexible-with-normalizing-flows/">this article</a> of mine.</p>
</section>
</section>
<section id="motivation-for-varying-coefficient-garch" class="level2">
<h2 class="anchored" data-anchor-id="motivation-for-varying-coefficient-garch">Motivation for Varying Coefficient GARCH</h2>
<p>As already mentioned, the linearity assumption can be limiting. The obvious fix would be to just use some non-linear alternative and call it a day. However, there are two issues with that. Let us start with the bigger one first:</p>
<p><strong>It is difficult to prove stability of non-linear models</strong>. Depending on our particular model, it can be tricky, if not impossible, to ensure that it is well behaved. At worst, we could see the forecast going completely bonkers over time.</p>
<p>With a plain linear model and some fundamental theory, it is straightforward to ensure that this doesn’t happen. Using an arbitrary model, though, this advantage can easily vanish.</p>
<p>The second issue is that <strong>standard non-linear models are hard to interpret</strong>. Consider again the standard GARCH setup: We can easily reason about the effect of each ‘factor’ in our model. Without further ado, we could also include additional factors like company sector, etc. in our model.</p>
<p>Obviously, this is not possible with an arbitrary, non-linear alternative anymore. Thus, considering both the above issues, a varying coefficient model becomes quite attractive.</p>
<section id="varying-coefficient-models" class="level3">
<h3 class="anchored" data-anchor-id="varying-coefficient-models">Varying coefficient models</h3>
<p>The straightforward rationale of varying coefficient models is the following: If fixed parameters are restrictive, why not just make them dynamic?</p>
<p>And indeed, this is what varying coefficient models are all about. Our primary goal is to move from static coefficients to ones that are dynamic given different inputs.</p>
<p>In a linear regression model, this could look as follows: <img src="https://latex.codecogs.com/png.latex?%0Ay=X%20%5Cbeta(X)%0A"> The coefficients are simply a function of the inputs. Thus, each input can have a unique set of linear model parameters. This allows us to model non-linear functions in a - locally - linear fashion.</p>
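<p>As a toy illustration of this idea (a hypothetical sketch, not any specific method from the literature), coefficients can be made input-dependent via kernel-weighted least squares, so that each query point gets its own intercept and slope:</p>

```python
import numpy as np

def local_linear_coefficients(x, y, x0, bandwidth=0.5):
    """Kernel-weighted least squares: a separate (intercept, slope) per query point x0."""
    w = np.exp(-0.5 * ((x - x0) / bandwidth) ** 2)  # Gaussian kernel weights around x0
    X = np.column_stack([np.ones_like(x), x])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta  # beta depends on x0, i.e. a varying coefficient

rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200)
y = np.sin(x) + rng.normal(0, 0.1, 200)  # non-linear target

# The locally linear fit at x0 = 0 should recover slope ~ cos(0) = 1
intercept0, slope0 = local_linear_coefficients(x, y, 0.0)
```

<p>Globally, the function is non-linear; locally, each fit is an ordinary linear model, which is exactly the appeal of the approach.</p>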
<p>For standard regression models, there is a lot of prior research dating back to the nineties. Nowadays, we also see some modern approaches to varying coefficient models.</p>
<p>One example is <a href="https://proceedings.neurips.cc/paper/2018/file/3e9f0fc9b2f89e043bc6233994dfcf76-Paper.pdf?ref=sarem-seitz.com">this fairly recent paper</a> which uses neural networks to model varying coefficients on a large scale. There, the approach is used for image classification. The findings are quite impressive - the model is able to highlight reasonable image sections that are most relevant for the model output.</p>
<p>There also exists previous work on varying coefficient GARCH; see for example:</p>
<ul>
<li><a href="https://www.degruyter.com/document/doi/10.1515/snde-2019-0091/html?lang=de&amp;ref=sarem-seitz.com">Here</a></li>
<li><a href="https://www.scirp.org/pdf/ojs_2022062914594783.pdf?ref=sarem-seitz.com">Here</a></li>
<li><a href="https://link.springer.com/chapter/10.1007/978-3-540-71297-8_7?ref=sarem-seitz.com">Here</a></li>
</ul>
<p>To keep things easy for now, we’ll use a fairly simplistic variant. From there, you can try different variations yourself.</p>
</section>
</section>
<section id="varying-coefficient-garch" class="level2">
<h2 class="anchored" data-anchor-id="varying-coefficient-garch">Varying Coefficient GARCH</h2>
<p>Let us re-state the standard <code>GARCH(1,1)</code> equations from before: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5Csigma_t%5E2%5Cright)%20%5C%5C%0A%5Csigma_t%5E2=%5Comega+%5Calpha%20y_%7Bt-1%7D%5E2+%5Cbeta%20%5Csigma_%7Bt-1%7D%5E2%20%5C%5C%0A%5Comega%3E0%20%5C%5C%0A0%3C%5Calpha+%5Cbeta%3C1%0A%5Cend%7Bgathered%7D%0A"> An alternative with varying coefficients then makes the coefficients a function of some variables of interest: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5Csigma_t%5E2%5Cright)%20%5C%5C%0A%5Csigma_t%5E2=%5Comega+%5Calpha_t%20y_%7Bt-1%7D%5E2+%5Cbeta_t%20%5Csigma_%7Bt-1%7D%5E2%20%5C%5C%0A%5Calpha_t=%5Cphi%5Cleft(y_%7Bt-1%7D,%20%5Csigma_%7Bt-1%7D%5Cright)%20%5C%5C%0A%5Cbeta_t=%5Cpsi%5Cleft(y_%7Bt-1%7D,%20%5Csigma_%7Bt-1%7D%5Cright)%0A%5Cend%7Bgathered%7D%0A"> Let us, again for simplicity, use a feedforward neural network to model the varying coefficients:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/varying-coefficient-garch/varcoeffgarch_structure.png" class="img-fluid figure-img" alt="Structural diagram of a varying coefficient GARCH model."></p>
<figcaption>Structural diagram of a varying coefficient GARCH model. The coefficients vary as a Neural Network function of past realizations and past standard deviation.</figcaption>
</figure>
</div>
<p>Now, we can take advantage of linearity in the varying coefficients and ensure stationarity via <img src="https://latex.codecogs.com/png.latex?%0A0%3C%5Calpha_t+%5Cbeta_t%3C1%0A"> for all <img src="https://latex.codecogs.com/png.latex?t">.</p>
<p>For static coefficient GARCH, it is relatively easy to ensure this stationarity condition. Just use one of the popular optimization packages and enter the respective linear constraint.</p>
<p>In the varying coefficient case this might, at first sight, not be as obvious. Neural network libraries don’t allow arbitrary constraints on the network output out of the box. Thus, we need to build a solution to this problem ourselves.</p>
<p>For GARCH(1,1), this is actually very simple. We only need to find a transformation of our network output, that ensures stationarity. In fact, we can simply do the following: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A&amp;%5Cbegin%7Bgathered%7D%0A%5Calpha_t=%5Csigma%5Cleft(%5Ctilde%7B%5Calpha%7D_t%5Cright)%20%5C%5C%0A%5Cbeta_t=%5Cleft(1-%5Csigma%5Cleft(%5Ctilde%7B%5Calpha%7D_t%5Cright)%5Cright)%20%5Ccdot%20%5Csigma%5Cleft(%5Ctilde%7B%5Cbeta%7D_t%5Cright)%0A%5Cend%7Bgathered%7D%5C%5C%0A&amp;%5Csigma(x)=%5Cfrac%7B1%7D%7B1+e%5E%7B-x%7D%7D%0A%5Cend%7Baligned%7D%0A"> By playing around with the sigmoid function, we can quickly find a suitable transformation. For arbitrary GARCH order, this can be more of a hassle, so we won’t consider this case here.</p>
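<p>This transformation is easy to verify numerically. A minimal sketch (random values play the role of the unconstrained network outputs):</p>

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def constrain_coefficients(alpha_raw, beta_raw):
    """Map unconstrained network outputs to coefficients with 0 < alpha + beta < 1."""
    alpha = sigmoid(alpha_raw)
    beta = (1.0 - sigmoid(alpha_raw)) * sigmoid(beta_raw)
    return alpha, beta

# Any real-valued raw outputs yield a valid, stationary coefficient pair
rng = np.random.default_rng(0)
raw = rng.normal(0, 3, size=(1000, 2))  # widely varying raw outputs
alpha, beta = constrain_coefficients(raw[:, 0], raw[:, 1])
```

<p>Since the sigmoid maps into (0, 1) and beta is scaled by the remaining mass 1 - alpha, the sum can never reach one, so the stationarity condition holds for every time step by construction.</p>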
<section id="some-rationale-behind-the-model" class="level3">
<h3 class="anchored" data-anchor-id="some-rationale-behind-the-model">Some rationale behind the model</h3>
<p>You might wonder why we don’t make the parameters dependent on <strong>squared</strong> past realizations and past variance. For the latter, it shouldn’t make that much of a difference if we used variance instead of standard deviation.</p>
<p>For past observations, though, there is a clear advantage: Negative and positive realizations of the time-series can affect future variance differently. In the stock market example, large negative returns are likely to cause greater variance in subsequent periods. This is, for example, the philosophy behind the <a href="https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity?ref=sarem-seitz.com#:~:text=.-,TGARCH,-model%5Bedit">TGARCH</a> model.</p>
<p>As for the neural network, we could also use other popular function approximators. This includes Regression Splines or higher order polynomials. Given their still undefeated popularity, I decided that neural networks would be the most interesting choice.</p>
</section>
</section>
<section id="a-quick-demonstration-of-varying-coefficient-garch" class="level2">
<h2 class="anchored" data-anchor-id="a-quick-demonstration-of-varying-coefficient-garch">A quick demonstration of varying coefficient GARCH</h2>
<p>In other GARCH articles, I have primarily used Python for the implementation. Today, let us use Julia for fun and education. Personally, I find the language much more efficient for some quick experiments. On the other hand, deployment is less of a pleasure if you need to manage the JIT overhead.</p>
<p>We begin with the usual data loading process. Here, I used 5 years of DAX data from <a href="https://finance.yahoo.com/?ref=sarem-seitz.com">yahoo finance</a> and calculated the log-returns of the adjusted close price. The final 100 observations are kept in a holdout set:</p>
<div id="cell-9" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Plots</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">CSV</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">DataFrames</span></span>
<span id="cb2-2"></span>
<span id="cb2-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#https://de.finance.yahoo.com/quote/%5EGDAXI/history?period1=1515801600&amp;period2=1673568000&amp;interval=1d&amp;filter=history&amp;frequency=1d&amp;includeAdjustedClose=true</span></span>
<span id="cb2-4">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CSV.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">File</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/GDAXI.csv"</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> DataFrame</span>
<span id="cb2-5"></span>
<span id="cb2-6">a_close_raw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[!,[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Adj Close"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Date"</span>]]</span>
<span id="cb2-7">a_close_nonull <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> a_close_raw[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">findall</span>(a_close_raw[!,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Adj Close"</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.!=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"null"</span>),<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb2-8">a_close <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">parse</span>.(<span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>, a_close_nonull[!,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Adj Close"</span>])</span>
<span id="cb2-9"></span>
<span id="cb2-10">returns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">diff</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>.(a_close))</span>
<span id="cb2-11"></span>
<span id="cb2-12">train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> returns[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">99</span>]</span>
<span id="cb2-13">train_idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> a_close_nonull[!,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Date"</span>][<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">99</span>]</span>
<span id="cb2-14"></span>
<span id="cb2-15">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> returns[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">99</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>]</span>
<span id="cb2-16">test_idx <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> a_close_nonull[!,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Date"</span>][<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">99</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>]</span>
<span id="cb2-17"></span>
<span id="cb2-18"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(train_idx,train, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^GDAXI daily returns - Train"</span>, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>), fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb2-19"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(test_idx, test, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^GDAXI daily returns - Test"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="4">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-garch_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Next, we create our varying coefficient GARCH. Unlike Python, Julia has no classes; being more or less a functional language, it relies on structs and plain functions instead.</p>
<p>Also, we do not store the latent state of our model (= the conditional variance) over time. This is due to the fact that <a href="https://fluxml.ai/Zygote.jl/stable/?ref=sarem-seitz.com"><code>Zygote</code></a>, one of Julia’s AutoDiff libraries, doesn’t allow mutating arrays. Rather, we recursively call the respective function and pass the current state to the next function call.</p>
<p>(Technically, it is possible to store intermediate states in a <a href="https://fluxml.ai/Zygote.jl/stable/?ref=sarem-seitz.com"><code>Zygote.Buffer</code></a>. Let us here use the functional variant anyway for educational purposes.)</p>
<div id="cell-11" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Flux</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Distributions</span></span>
<span id="cb3-2"></span>
<span id="cb3-3"></span>
<span id="cb3-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">struct</span> VarCoeffGARCH</span>
<span id="cb3-5">    constant<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Vector{Float32}</span></span>
<span id="cb3-6">    net<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Chain</span></span>
<span id="cb3-7">    </span>
<span id="cb3-8">    x0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Vector{Float32}</span></span>
<span id="cb3-9"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-10">Flux.<span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@functor</span> VarCoeffGARCH</span>
<span id="cb3-11"></span>
<span id="cb3-12"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>(net<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Chain</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>([<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">9</span>], net, [<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>])</span>
<span id="cb3-13"></span>
<span id="cb3-14"></span>
<span id="cb3-15"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_mean_ll</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>, y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Vector{Float32}</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span></span>
<span id="cb3-16">    sigmas, _ <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward</span>(m,y)</span>
<span id="cb3-17">    conditional_dists <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>.(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, sigmas)</span>
<span id="cb3-18">    </span>
<span id="cb3-19">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logpdf</span>.(conditional_dists, y))</span>
<span id="cb3-20"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-21"></span>
<span id="cb3-22"></span>
<span id="cb3-23"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Use functional implementation to calculate conditional stddev.</span></span>
<span id="cb3-24"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Then, we don't need to store stddev_t to calculate stddev_t+1</span></span>
<span id="cb3-25"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#and thus avoid mutation, which doesn't work with Zygote</span></span>
<span id="cb3-26"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#(could use Zygote.Buffer, but it's often discouraged)</span></span>
<span id="cb3-27"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>, y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Vector{Float32}</span>)</span>
<span id="cb3-28">    </span>
<span id="cb3-29">    sigma_1, params_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">m</span>(m.x0[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">softplus</span>(m.constant[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])))</span>
<span id="cb3-30">    </span>
<span id="cb3-31">    sigma_rec, params_rec <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward_recurse</span>(m, sigma_1, y, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-32">    </span>
<span id="cb3-33">    sigmas_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(sigma_1, sigma_rec)</span>
<span id="cb3-34">    params_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(params_1, params_rec)</span>
<span id="cb3-35">    </span>
<span id="cb3-36">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> sigmas_result, params_result</span>
<span id="cb3-37">    </span>
<span id="cb3-38"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-39"></span>
<span id="cb3-40"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward_recurse</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>, sigma_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>, y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Vector{Float32}</span>, t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Int64</span>)</span>
<span id="cb3-41">    </span>
<span id="cb3-42">    sigma_t, params_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">m</span>(y[t], sigma_tm1)</span>
<span id="cb3-43">    </span>
<span id="cb3-44">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(y)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb3-45">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> sigma_t, params_t</span>
<span id="cb3-46">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-47">    </span>
<span id="cb3-48">    sigma_rec, params_rec <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward_recurse</span>(m, sigma_t, y, t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-49">    </span>
<span id="cb3-50">    sigmas_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(sigma_t, sigma_rec)</span>
<span id="cb3-51">    params_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(params_t, params_rec)</span>
<span id="cb3-52">    </span>
<span id="cb3-53">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> sigmas_result, params_result</span>
<span id="cb3-54"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-55"></span>
<span id="cb3-56"></span>
<span id="cb3-57"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> (m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>)(y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>, sigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>)</span>
<span id="cb3-58">    </span>
<span id="cb3-59">    input_vec <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(y, sigma)</span>
<span id="cb3-60">    </span>
<span id="cb3-61">    params <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> m.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">net</span>(input_vec)</span>
<span id="cb3-62">    params_stable <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_garch_stable_params</span>(params) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#to ensure stationarity of the resulting GARCH process</span></span>
<span id="cb3-63">        </span>
<span id="cb3-64">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">softplus</span>(m.constant[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(input_vec<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span> params_stable)), params_stable</span>
<span id="cb3-65"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-66"></span>
<span id="cb3-67"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#transform both parameters to be &gt;0 each and their sum to be &lt;1</span></span>
<span id="cb3-68"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_garch_stable_params</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Vector{Float32}</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">σ</span>(x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]), (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-σ</span>(x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]))<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*σ</span>(x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]))</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="5">
<pre><code>get_garch_stable_params (generic function with 1 method)</code></pre>
</div>
</div>
<p>Next, we create and train our varying coefficient GARCH model. Notice that I used a rather tiny architecture for the respective neural network. This hopefully counters the risk of overfitting to some extent.</p>
<p>With more effort, we could experiment with different architectures. Here, however, this is left as an exercise for the reader.</p>
<div id="cell-13" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb5-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Zygote</span></span>
<span id="cb5-2"></span>
<span id="cb5-3"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb5-4"></span>
<span id="cb5-5">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Chain</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Dense</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,softplus), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Dense</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,softplus), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Dense</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)))</span>
<span id="cb5-6">params <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">params</span>(model)</span>
<span id="cb5-7"></span>
<span id="cb5-8">opt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ADAM</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>)</span>
<span id="cb5-9"></span>
<span id="cb5-10"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span></span>
<span id="cb5-11">    grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Zygote.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gradient</span>(()<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-&gt;-garch_mean_ll</span>(model, train), params)</span>
<span id="cb5-12">    Flux.Optimise.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">update!</span>(opt,params,grads)</span>
<span id="cb5-13">    </span>
<span id="cb5-14">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> i<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb5-15">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">println</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_mean_ll</span>(model,train))</span>
<span id="cb5-16">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb5-17"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>2.973238
3.0015454
3.0188503
3.0298147
3.038704
3.0551393
3.0628562
3.0676363
3.0707815
3.0733144</code></pre>
</div>
</div>
<p>Notice that we get the gradients via AutoDiff from <code>Zygote</code>. Another popular approach for GARCH models is to use black-box gradients via finite differences. Given that our neural network could easily have many more parameters, this would quickly become infeasible.</p>
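<p>To see why finite differences scale poorly, consider a minimal sketch (not from the post itself): a central-difference gradient needs two loss evaluations per parameter, so a full gradient costs 2p evaluations for p parameters, whereas reverse-mode AutoDiff costs a small constant multiple of a single loss evaluation regardless of p.</p>
<pre class="sourceCode julia"><code># Illustrative central finite-difference gradient: 2 loss evaluations
# per parameter, i.e. 2p evaluations per gradient for p parameters.
function fd_gradient(f, theta::Vector{Float64}; h=1e-5)
    grad = similar(theta)
    for i in eachindex(theta)
        e = zeros(length(theta)); e[i] = h
        grad[i] = (f(theta .+ e) - f(theta .- e)) / (2h)
    end
    return grad
end</code></pre>
<p>For the tiny network above this would still be workable, but the cost grows linearly with the parameter count, while a single reverse-mode pass through <code>Zygote</code> does not.</p>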
<p>After model fitting, we can plot the in-sample predictions to check if everything went well:</p>
<div id="cell-15" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb7-1">sigmas, params <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward</span>(model,train)</span>
<span id="cb7-2"></span>
<span id="cb7-3">lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>.(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,sigmas),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span>
<span id="cb7-4">upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>.(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,sigmas),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)</span>
<span id="cb7-5"></span>
<span id="cb7-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(train_idx, train, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^GDAXI daily returns"</span>, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>), title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"In-Sample predictions"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb7-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(train_idx, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(lower)), ribbon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(upper,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>lower),label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"90% CI"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="7">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-garch_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This looks like a reasonable GARCH prediction for the in-sample data. To see whether the model also works out-of-sample, we generate a forecast via Monte Carlo sampling. This is necessary because the probabilistic forecast at <img src="https://latex.codecogs.com/png.latex?t"> cannot be integrated forward to <img src="https://latex.codecogs.com/png.latex?t+1"> analytically.</p>
<div id="cell-17" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb8-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward_sample</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">VarCoeffGARCH</span>, sigma_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>, y_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>, t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Int64</span>, T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Int64</span>=<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb8-2">    </span>
<span id="cb8-3">    sigma_t, params_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">m</span>(y_tm1, sigma_tm1)</span>
<span id="cb8-4">    sample_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>(<span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Float32</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma_t</span>
<span id="cb8-5">    </span>
<span id="cb8-6">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span>T</span>
<span id="cb8-7">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> sigma_t, sample_t, params_t</span>
<span id="cb8-8">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb8-9">    </span>
<span id="cb8-10">    sigma_rec, sample_rec, params_rec <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward_sample</span>(m, sigma_t, sample_t, t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, T)</span>
<span id="cb8-11">    </span>
<span id="cb8-12">    sigmas_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(sigma_t, sigma_rec)</span>
<span id="cb8-13">    sample_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(sample_t, sample_rec)</span>
<span id="cb8-14">    params_result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(params_t, params_rec)</span>
<span id="cb8-15">    </span>
<span id="cb8-16">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> sigmas_result, sample_result, params_result</span>
<span id="cb8-17">    </span>
<span id="cb8-18"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb8-19"></span>
<span id="cb8-20"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb8-21"></span>
<span id="cb8-22">mc_simulation <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">garch_forward_sample</span>(model, sigmas[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>], train[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>], <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) for _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">25000</span>]</span>
<span id="cb8-23"></span>
<span id="cb8-24">sigma_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], mc_simulation)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb8-25">y_forecast_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>], mc_simulation)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb8-26">params1_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>], mc_simulation)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb8-27">params2_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>x[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>], mc_simulation)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb8-28"></span>
<span id="cb8-29">y_forecast_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(y_forecast_sample,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb8-30">y_forecast_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">x-&gt;quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>), y_forecast_sample, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb8-31">y_forecast_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">x-&gt;quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>), y_forecast_sample, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb8-32"></span>
<span id="cb8-33"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(test[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>], size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>), title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"100 steps ahead forecast"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb8-34"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(y_forecast_mean, ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_forecast_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>y_forecast_mean, y_forecast_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>y_forecast_lower), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Forecast and 90% CI"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="8">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-garch_files/figure-html/cell-7-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Again, a reasonable-looking plot. To check whether we actually built anything useful, let us also compare against a standard GARCH(1,1) forecast. Once more, we need to integrate numerically:</p>
<div id="cell-19" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb9-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">ARCHModels</span></span>
<span id="cb9-2"></span>
<span id="cb9-3">garch_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fit</span>(GARCH{<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>}, train)</span>
<span id="cb9-4">garch_model_dummy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">fit</span>(GARCH{<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>}, train[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#to get latent variance of final training observation</span></span>
<span id="cb9-5"></span>
<span id="cb9-6"></span>
<span id="cb9-7"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb9-8"></span>
<span id="cb9-9">var_T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">predict</span>(garch_model_dummy, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>variance, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb9-10">y_T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>]</span>
<span id="cb9-11"></span>
<span id="cb9-12">garch_coefs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> garch_model.spec.coefs</span>
<span id="cb9-13">mean_coef <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> garch_model.meanspec.coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb9-14"></span>
<span id="cb9-15">garch_sigma_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">25000</span>)</span>
<span id="cb9-16">garch_forecast_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">25000</span>)</span>
<span id="cb9-17"></span>
<span id="cb9-18"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">25000</span></span>
<span id="cb9-19">    sigma_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>var_T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(y_T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>mean_coef)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb9-20">    garch_sigma_sample[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,i] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigma_1</span>
<span id="cb9-21">    </span>
<span id="cb9-22">    forecast_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma_1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>mean_coef</span>
<span id="cb9-23">    garch_forecast_sample[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,i] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> forecast_sample</span>
<span id="cb9-24">    </span>
<span id="cb9-25">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span></span>
<span id="cb9-26">        var_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> garch_sigma_sample[t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,i]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb9-27">        eps_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (garch_forecast_sample[t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,i]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>mean_coef)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb9-28">        </span>
<span id="cb9-29">        sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>(garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>var_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>eps_tm1)</span>
<span id="cb9-30">        garch_sigma_sample[t,i] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigma_t</span>
<span id="cb9-31">        </span>
<span id="cb9-32">        forecast_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>mean_coef</span>
<span id="cb9-33">        garch_forecast_sample[t,i] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> forecast_sample</span>
<span id="cb9-34">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb9-35"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb9-36">    </span>
<span id="cb9-37">garch_forecast_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(garch_forecast_sample,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb9-38">garch_forecast_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">x-&gt;quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>), garch_forecast_sample, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb9-39">garch_forecast_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">x-&gt;quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>), garch_forecast_sample, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb9-40"></span>
<span id="cb9-41"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(test[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>], size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>), title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"100 steps ahead forecast"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb9-42"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(y_forecast_mean, ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (y_forecast_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>y_forecast_mean, y_forecast_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>y_forecast_lower), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"VarCoef GARCH forecast"</span>)</span>
<span id="cb9-43"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(garch_forecast_mean, </span>
<span id="cb9-44">      ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (garch_forecast_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>garch_forecast_mean, garch_forecast_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>garch_forecast_lower),</span>
<span id="cb9-45">      label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Standard GARCH forecast"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="9">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-garch_files/figure-html/cell-8-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The standard GARCH model produces a wider forecast interval. To compare both models quantitatively, we use the average out-of-sample log-likelihood, estimated by evaluating a kernel density estimate of each forecast sample at the realized test value:</p>
<div id="cell-21" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb10-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">KernelDensity</span></span>
<span id="cb10-2">var_coef_ll <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>([<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kde</span>(y_forecast_sample[t,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]),test[t])) for t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>])</span>
<span id="cb10-3">standard_ll <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>([<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kde</span>(garch_forecast_sample[t,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]),test[t])) for t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>])</span>
<span id="cb10-4"></span>
<span id="cb10-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">println</span>(var_coef_ll)</span>
<span id="cb10-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">println</span>(standard_ll)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>3.006017233022985
3.0003596946242705</code></pre>
</div>
</div>
<p>Our model achieves a slightly better out-of-sample log-likelihood. We could likely improve this further with a different network architecture and/or a higher GARCH model order. Just try not to overfit!</p>
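<p>For readers more comfortable with Python, the KDE-based scoring used above could be sketched roughly as follows, using scipy’s <code>gaussian_kde</code> in place of Julia’s KernelDensity; the helper name and the toy data are made up for illustration:</p>

```python
import numpy as np
from scipy.stats import gaussian_kde

def mean_kde_loglik(forecast_samples, actuals):
    """Average log-likelihood of the actuals under a KDE fitted
    to each time step's forecast samples (rows = time steps)."""
    logliks = []
    for t in range(len(actuals)):
        kde = gaussian_kde(forecast_samples[t, :])
        logliks.append(np.log(kde(actuals[t])[0]))
    return float(np.mean(logliks))

rng = np.random.default_rng(0)
samples = rng.normal(size=(5, 1000))  # toy stand-in for forecast draws
actuals = rng.normal(size=5)          # toy stand-in for the test set
score = mean_kde_loglik(samples, actuals)
```

As in the Julia code, a higher (less negative) score means the forecast densities put more mass on the realized observations.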
<p>Finally, we can take a look at the behaviour of the varying coefficients. One interesting view plots past in-sample observations against each coefficient. For comparison, I also added the fixed coefficients from the standard GARCH model:</p>
<div id="cell-23" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb12-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">LaTeXStrings</span></span>
<span id="cb12-2"></span>
<span id="cb12-3">title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Varying coefficient GARCH: "</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>L<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"\sigma^2_t=\omega + </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\a</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">lpha^{NN}(y_{t-1},\sigma_{t-1})y^2_{t-1}+</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\b</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">eta^{NN}(y_{t-1},\sigma_{t-1})σ^2_{t-1}"</span>, grid <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">false</span>, showaxis <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">false</span>)</span>
<span id="cb12-4"></span>
<span id="cb12-5"></span>
<span id="cb12-6">p1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter</span>(train[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>], label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none, guidefontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb12-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">xlabel!</span>(p1,L<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y_{t-1}"</span>)</span>
<span id="cb12-8"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ylabel!</span>(p1,L<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\a</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">lpha_t"</span>)</span>
<span id="cb12-9"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hline!</span>([garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>]], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Parameter in GARCH model"</span>)</span>
<span id="cb12-10"></span>
<span id="cb12-11"></span>
<span id="cb12-12">p2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter</span>(train[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>], label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none, guidefontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb12-13"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">xlabel!</span>(p2,L<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y_{t-1}"</span>)</span>
<span id="cb12-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ylabel!</span>(p2,L<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">\b</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">eta_t"</span>)</span>
<span id="cb12-15"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hline!</span>([garch_coefs[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Parameter in GARCH model"</span>)</span>
<span id="cb12-16"></span>
<span id="cb12-17"></span>
<span id="cb12-18"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(title, p1, p2, layout <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@layout</span>([A{<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>h}; [B C]]), size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>), left_margin<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>Plots.mm, bottom_margin<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>Plots.mm,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="11">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/varying-coefficient-garch_files/figure-html/cell-10-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Interestingly, both varying coefficients are in the same ballpark as standard GARCH. This adds some confidence that we are on the right track.</p>
<p>Nevertheless, as mentioned above, there is possibly still a lot of room for improvement.</p>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Now, where do we go from here? To put it bluntly, we have yet another GARCH variation that promises to fix one limitation of standard GARCH. With a little sophistication, we get a model that is flexible and fairly transparent at the same time.</p>
<p>We could now, for example, easily introduce external factors into our model. The current state of the general economy or the company’s sector are likely to influence return volatility.</p>
<p>With our varying coefficient GARCH, we can account for such effects. At the same time, it is possible to validate the predicted effect of each feature.</p>
<p>The biggest advantage in my opinion is, however, that we don’t have to worry about stationarity. If we restrict our model to always yield valid GARCH coefficients, there is no risk of exploding forecasts.</p>
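<p>As a rough illustration of that restriction, unconstrained network outputs can be mapped into the valid GARCH(1,1) region via a softplus and a normalization step. This is only a sketch in Python (the post itself uses Julia), and the particular mapping is just one of several reasonable choices:</p>

```python
import numpy as np

def to_valid_garch_coefs(raw):
    """Map three unconstrained outputs (e.g. from a neural net)
    to GARCH(1,1) coefficients with omega > 0, alpha, beta >= 0
    and alpha + beta < 1 (covariance stationarity)."""
    omega = np.log1p(np.exp(raw[0]))        # softplus ensures omega > 0
    shares = np.exp(raw[1:3])
    shares = shares / (1.0 + shares.sum())  # implicit third share keeps sum < 1
    alpha, beta = shares
    return omega, alpha, beta

omega, alpha, beta = to_valid_garch_coefs(np.array([0.3, -1.2, 2.5]))
```

Whatever the network emits, the resulting coefficients always define a stationary GARCH recursion, so long-horizon forecasts cannot explode.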
<p>This makes the model quite powerful, yet fairly simple to handle. If you have any questions about it, please let me know.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Bollerslev, Tim. Modelling the coherence in short-run nominal exchange rates: a multivariate generalized ARCH model. The review of economics and statistics, 1990, p.&nbsp;498-505.</p>
<p><strong>[2]</strong> Donfack, Morvan Nongni; Dufays, Arnaud. Modeling time-varying parameters using artificial neural networks: a GARCH illustration. Studies in Nonlinear Dynamics &amp; Econometrics, 2021, p.&nbsp;311-343.</p>
<p><strong>[3]</strong> Hastie, Trevor; Tibshirani, Robert. Varying coefficient models. Journal of the Royal Statistical Society: Series B (Methodological), 1993, p.&nbsp;757-779.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/varying-coefficient-garch.html</guid>
  <pubDate>Thu, 19 Jan 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>When Point Forecasts Are Completely Useless</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p><a href="https://www.sarem-seitz.com/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities/">In the last article</a>, we discussed one advantage of probabilistic forecasts over point forecasts - namely, handling time-to-exceedance problems. In this post, we will examine another limitation of point forecasts: higher-order statistical properties.</p>
<p>The ideas will be very familiar to those with a background in mathematics or statistics. Readers without formal training in either will therefore probably benefit the most from this article.</p>
<p>By the end of this post, you’ll have a better idea of how higher order statistical properties can impact the performance of your forecasts. In particular, we will see how point forecasts can actually completely fail without further adjustment.</p>
</section>
<section id="two-examples-of-point-forecast-failure" class="level2">
<h2 class="anchored" data-anchor-id="two-examples-of-point-forecast-failure">Two examples of point forecast failure</h2>
<p>To illustrate the issues with point forecasts, let us look at two very simple examples. Both time-series follow a fairly simple, auto-regressive data generating process.</p>
<p>We will generate enough data for an auto-regressive Gradient Boosting model to be sensible. Thus, we avoid both an overly inflexible model and overfitting due to a lack of data.</p>
<section id="example-1---auto-regressive-variance-garch" class="level3">
<h3 class="anchored" data-anchor-id="example-1---auto-regressive-variance-garch">Example 1 - Auto-regressive variance (GARCH)</h3>
<div id="cell-4" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"></span>
<span id="cb1-4">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb1-5"></span>
<span id="cb1-6">time_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.random.normal()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, np.random.normal()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>]</span>
<span id="cb1-7">sigs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>]</span>
<span id="cb1-8"></span>
<span id="cb1-9"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2000</span>):</span>
<span id="cb1-10">    sig_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>time_series[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>time_series[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigs[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigs[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-11">    y_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.normal() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sig_t</span>
<span id="cb1-12">    time_series.append(y_t)</span>
<span id="cb1-13">    sigs.append(sig_t)</span>
<span id="cb1-14">    </span>
<span id="cb1-15">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(time_series[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:])</span>
<span id="cb1-16"></span>
<span id="cb1-17">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb1-18">plt.plot(y, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Simulated Time-Series"</span>)</span>
<span id="cb1-19"></span>
<span id="cb1-20">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-21">plt.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This is a standard <a href="https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity?ref=sarem-seitz.com">GARCH</a> time-series as they are frequently encountered in econometrics. If you want to get some ideas on how you can handle such data, I have also written a few articles:</p>
<ul>
<li><a href="https://www.sarem-seitz.com/random-forests-and-boosting-for-arch-like-volatility-forecasts/">Random Forests and Boosting for ARCH-like volatility forecasts</a></li>
<li><a href="https://www.sarem-seitz.com/multivariate-garch-with-python-and-tensorflow/">Multivariate GARCH with Python and Tensorflow</a></li>
<li><a href="https://www.sarem-seitz.com/lets-make-garch-more-flexible-with-normalizing-flows/">Let’s make GARCH more flexible with Normalizing Flows</a></li>
</ul>
<p>Anyway, let us use the cookie cutter approach of Machine Learning for time-series for now. Namely, we use <a href="https://github.com/Nixtla/mlforecast?ref=sarem-seitz.com">Nixtla’s mlforecast package</a> to build an auto-regressive Boosting model for us. (This is not meant to bash on the Nixtla package. In fact, it is really helpful and convenient if you know what you are doing.)</p>
<p>The results look as follows:</p>
<div id="cell-6" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mlforecast.utils <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> generate_daily_series</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.ensemble <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> GradientBoostingRegressor</span>
<span id="cb2-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mlforecast <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> MLForecast</span>
<span id="cb2-4"></span>
<span id="cb2-5">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb2-6"></span>
<span id="cb2-7">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1800</span>]</span>
<span id="cb2-8">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1800</span>:]</span>
<span id="cb2-9"></span>
<span id="cb2-10">series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> generate_daily_series(</span>
<span id="cb2-11">    n_series<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb2-12">    max_length<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>,</span>
<span id="cb2-13">).iloc[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1800</span>,:]</span>
<span id="cb2-14"></span>
<span id="cb2-15">series[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train</span>
<span id="cb2-16"></span>
<span id="cb2-17">models <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [</span>
<span id="cb2-18">    GradientBoostingRegressor(),</span>
<span id="cb2-19">]</span>
<span id="cb2-20"></span>
<span id="cb2-21">fcst <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MLForecast(</span>
<span id="cb2-22">    models<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>models,</span>
<span id="cb2-23">    freq<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'D'</span>,</span>
<span id="cb2-24">    lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]</span>
<span id="cb2-25">)</span>
<span id="cb2-26"></span>
<span id="cb2-27">fcst.fit(series, id_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'index'</span>, time_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'ds'</span>, target_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'y'</span>)</span>
<span id="cb2-28">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fcst.predict(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>)</span>
<span id="cb2-29"></span>
<span id="cb2-30"></span>
<span id="cb2-31">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb2-32"></span>
<span id="cb2-33">plt.plot(y_test, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>)</span>
<span id="cb2-34">plt.plot(predictions.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].values, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gradient Boosting forecast"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb2-35"></span>
<span id="cb2-36">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-37">plt.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Unfortunately, the result does not help at all. Although we have provided the model with the actual ground-truth number of lags, the forecast is practically useless.</p>
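<p>This failure is no accident: in a GARCH process the conditional mean is zero, so the best possible point forecast is a flat line near zero; all the predictable structure sits in the variance. A minimal Python sketch (with made-up ARCH(1) parameters) shows this by comparing the lag-1 autocorrelation of the series with that of its squares:</p>

```python
import numpy as np

rng = np.random.default_rng(987)

# Simulate a toy ARCH(1)-style series: the level is unpredictable,
# but the variance is auto-correlated.
y = [0.1 * rng.normal()]
for _ in range(2000):
    sig = np.sqrt(0.1 + 0.5 * y[-1] ** 2)
    y.append(sig * rng.normal())
y = np.array(y)

def lag1_autocorr(x):
    x = x - x.mean()
    return float(np.sum(x[1:] * x[:-1]) / np.sum(x * x))

acf_level = lag1_autocorr(y)         # roughly zero: levels are unpredictable
acf_squared = lag1_autocorr(y ** 2)  # clearly positive: variance is predictable
```

A point forecast of the level can only ever exploit the first kind of structure, which here simply does not exist.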
</section>
<section id="example-2---auto-regressive-non-gaussian-data" class="level3">
<h3 class="anchored" data-anchor-id="example-2---auto-regressive-non-gaussian-data">Example 2 - Auto-regressive, non-Gaussian data</h3>
<p>This next example uses a more contrived data generating process. Nevertheless, some real-world time-series might plausibly follow a similar logic:</p>
<div id="cell-9" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> beta</span>
<span id="cb3-2"></span>
<span id="cb3-3">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">321</span>)</span>
<span id="cb3-4"></span>
<span id="cb3-5">time_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [beta(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>).rvs()]</span>
<span id="cb3-6"></span>
<span id="cb3-7"></span>
<span id="cb3-8"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2000</span>):</span>
<span id="cb3-9">    alpha_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> time_series[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> t</span>
<span id="cb3-10">    beta_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> alpha_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span></span>
<span id="cb3-11">    y_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> beta(alpha_t, beta_t).rvs()</span>
<span id="cb3-12">    time_series.append(y_t)</span>
<span id="cb3-13"></span>
<span id="cb3-14">    </span>
<span id="cb3-15">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(time_series[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:])</span>
<span id="cb3-16"></span>
<span id="cb3-17">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb3-18">plt.plot(y, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Simulated Time-Series"</span>)</span>
<span id="cb3-19"></span>
<span id="cb3-20">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb3-21">plt.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Let us check how a Gradient Boosting model performs for this case:</p>
<div id="cell-11" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mlforecast.utils <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> generate_daily_series</span>
<span id="cb4-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.ensemble <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> GradientBoostingRegressor</span>
<span id="cb4-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> mlforecast <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> MLForecast</span>
<span id="cb4-4"></span>
<span id="cb4-5">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb4-6"></span>
<span id="cb4-7">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1800</span>]</span>
<span id="cb4-8">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1800</span>:]</span>
<span id="cb4-9"></span>
<span id="cb4-10">series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> generate_daily_series(</span>
<span id="cb4-11">    n_series<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,</span>
<span id="cb4-12">    max_length<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>,</span>
<span id="cb4-13">).iloc[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1800</span>,:]</span>
<span id="cb4-14"></span>
<span id="cb4-15">series[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train</span>
<span id="cb4-16"></span>
<span id="cb4-17">models <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [</span>
<span id="cb4-18">    GradientBoostingRegressor(max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb4-19">]</span>
<span id="cb4-20"></span>
<span id="cb4-21">fcst <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MLForecast(</span>
<span id="cb4-22">    models<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>models,</span>
<span id="cb4-23">    freq<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'D'</span>,</span>
<span id="cb4-24">    lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb4-25">)</span>
<span id="cb4-26"></span>
<span id="cb4-27">fcst.fit(series, id_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'index'</span>, time_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'ds'</span>, target_col<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'y'</span>)</span>
<span id="cb4-28">predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fcst.predict(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span>)</span>
<span id="cb4-29"></span>
<span id="cb4-30"></span>
<span id="cb4-31">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb4-32"></span>
<span id="cb4-33">plt.plot(y_test, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>)</span>
<span id="cb4-34">plt.plot(predictions.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].values, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gradient Boosting forecast"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb4-35"></span>
<span id="cb4-36">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-37">plt.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-5-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Again, the forecast is utterly useless.</p>
</section>
</section>
<section id="what-has-gone-wrong-with-our-point-forecasts" class="level2">
<h2 class="anchored" data-anchor-id="what-has-gone-wrong-with-our-point-forecasts">What has gone wrong with our point forecasts?</h2>
<p>As you might know, <code>sklearn.ensemble.GradientBoostingRegressor</code> minimizes the mean-squared error (MSE) by default. The following is a well-known property of MSE-minimization:</p>
<blockquote class="blockquote">
<p><em>A distribution’s mean minimizes its mean-squared error.</em></p>
</blockquote>
<p>Mathematically: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A&amp;%20%5Carg%20%5Cmin%20_%7B%5Chat%7Bf%7D%7D%20%5Cmathbb%7BE%7D%5Cleft%5B%5Cleft(y_t-%5Chat%7Bf%7D%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-s%7D%5Cright)%5Cright)%5E2%5Cright%5D%20%5C%5C%0A%5CRightarrow%20&amp;%20%5Cleft.%5Chat%7Bf%7D%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-s%7D%5Cright)%5Cright)%20%5Cequiv%20%5Cmathbb%7BE%7D%5Cleft%5By_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-s%7D%5Cright%5D%0A%5Cend%7Baligned%7D%0A"> where we presume an arbitrarily large set of admissible functions. We also implicitly assume that the conditional mean actually exists, which is reasonable for most well-behaved forecasting problems.</p>
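<p>This property is easy to verify numerically. As a quick check (my own illustration, not part of the original analysis): among all constant predictions for a sample, the sample mean minimizes the average squared error.</p>

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(42)
# any distribution works; a skewed one makes the point less trivial
sample = rng.gamma(shape=2.0, scale=3.0, size=10_000)

# find the constant c that minimizes the mean-squared error to the sample
result = minimize_scalar(lambda c: np.mean((sample - c) ** 2))

# the minimizer coincides with the sample mean
print(result.x, sample.mean())
```
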
<p>Thus, both of the above models aim to forecast the mean of the conditional distribution of our observations. The issue here is that the conditional mean is actually constant by construction.</p>
<p>This is obvious for the first example: each observation has a conditional mean of zero. For the second example, a formal proof requires some math and is left to the interested reader.</p>
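<p>As a hint for the second example (a sketch of mine, not a formal proof): the mean of a Beta(a, b) distribution is a / (a + b), and the simulation always sets beta_t = 20 * alpha_t, so the conditional mean is alpha_t / (21 * alpha_t) = 1/21 at every step, no matter how alpha_t evolves:</p>

```python
from scipy.stats import beta

# Example 2 sets beta_t = 20 * alpha_t, so E[Beta(a, 20a)] = a / (a + 20a) = 1/21
for alpha_t in [0.5, 3.0, 25.0, 400.0]:
    assert abs(beta(alpha_t, 20 * alpha_t).mean() - 1 / 21) < 1e-12

constant_mean = 1 / 21
print(constant_mean)
```
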
<p>Now, although the conditional mean remains constant over time, our time-series is still far from pure noise. Predicting the mean via MSE-minimization is therefore inadequate to describe the future.</p>
<p>We can go even further and proclaim:</p>
<blockquote class="blockquote">
<p><em>Even a perfect (point-) forecasting model can be useless if the forecast quantity is uninformative.</em></p>
</blockquote>
<p>We can visualize this by plotting the conditional densities against the conditional means from our examples:</p>
<div id="cell-14" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> norm</span>
<span id="cb5-2"></span>
<span id="cb5-3">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb5-4">line_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>)</span>
<span id="cb5-5"></span>
<span id="cb5-6">time_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.random.normal()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, np.random.normal()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>]</span>
<span id="cb5-7">sigs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>]</span>
<span id="cb5-8">conditional_pdfs_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [norm(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>).pdf(line_1)]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb5-9">conditional_means_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.</span>]</span>
<span id="cb5-10"></span>
<span id="cb5-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2000</span>):</span>
<span id="cb5-12">    sig_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>time_series[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>time_series[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigs[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.24</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigs[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-13">    y_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.random.normal() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sig_t</span>
<span id="cb5-14">    time_series.append(y_t)</span>
<span id="cb5-15">    sigs.append(sig_t)</span>
<span id="cb5-16">    </span>
<span id="cb5-17">    conditional_pdfs_1.append(norm(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,sig_t).pdf(line_1))</span>
<span id="cb5-18">    conditional_means_1.append(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.</span>)</span>
<span id="cb5-19">    </span>
<span id="cb5-20">conditional_pdfs_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> conditional_pdfs_1[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:]</span>
<span id="cb5-21">conditional_means_1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> conditional_means_1[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:]</span>
<span id="cb5-22"></span>
<span id="cb5-23"></span>
<span id="cb5-24">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb5-25"></span>
<span id="cb5-26">line_2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>)</span>
<span id="cb5-27"></span>
<span id="cb5-28">time_series <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [beta(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>).rvs()]</span>
<span id="cb5-29">conditional_pdfs_2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [beta(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>).pdf(line_2)]</span>
<span id="cb5-30">conditional_means_2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [beta(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>).mean()]</span>
<span id="cb5-31"></span>
<span id="cb5-32"></span>
<span id="cb5-33"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2000</span>):</span>
<span id="cb5-34">    alpha_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> time_series[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> t</span>
<span id="cb5-35">    beta_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> alpha_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span></span>
<span id="cb5-36">    y_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> beta(alpha_t, beta_t).rvs()</span>
<span id="cb5-37">    time_series.append(y_t)</span>
<span id="cb5-38">    </span>
<span id="cb5-39">    conditional_pdfs_2.append(beta(alpha_t, beta_t).pdf(line_2))</span>
<span id="cb5-40">    conditional_means_2.append(beta(alpha_t, beta_t).mean())</span>
<span id="cb5-41"></span>
<span id="cb5-42">    </span>
<span id="cb5-43">conditional_pdfs_2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> conditional_pdfs_2[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]</span>
<span id="cb5-44">conditional_means_2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> conditional_means_2[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]</span>
<span id="cb5-45"></span>
<span id="cb5-46"></span>
<span id="cb5-47">_, (ax1, ax2) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb5-48"></span>
<span id="cb5-49">ax1.plot(line_1, conditional_pdfs_1[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional density, t=1"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span>
<span id="cb5-50">ax1.plot(line_1, conditional_pdfs_1[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">999</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
"Conditioanl">
font-style: inherit;">"Conditional density, t=1000"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>)</span>
<span id="cb5-51">ax1.plot(line_1, conditional_pdfs_1[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1999</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional density, t=2000"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb5-52"></span>
<span id="cb5-53">ax1.axvline([conditional_means_1[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]], lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
"Constant,">
font-style: inherit;">"Constant conditional mean"</span>)</span>
<span id="cb5-54">ax1.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb5-55"></span>
<span id="cb5-56">ax1.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">11</span>)</span>
<span id="cb5-57">ax1.set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Example 1"</span>, fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb5-58"></span>
<span id="cb5-59"></span>
<span id="cb5-60">ax2.plot(line_2, conditional_pdfs_2[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional density, t=1"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span>
<span id="cb5-61">ax2.plot(line_2, conditional_pdfs_2[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">999</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
"Conditioanl">
font-style: inherit;">"Conditional density, t=1000"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>)</span>
<span id="cb5-62">ax2.plot(line_2, conditional_pdfs_2[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1999</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional density, t=2000"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb5-63"></span>
<span id="cb5-64">ax2.axvline([conditional_means_2[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]], lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Constant, conditional mean"</span>)</span>
<span id="cb5-65">ax2.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb5-66"></span>
<span id="cb5-67">ax2.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">11</span>)</span>
<span id="cb5-68">ax2.set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Example 2"</span>, fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>On the one hand, the conditional distribution varies and can, by construction, be predicted from the past. The conditional mean, however, is constant and tells us nothing about the future distribution.</p>
</section>
<section id="what-can-we-do" class="level2">
<h2 class="anchored" data-anchor-id="what-can-we-do">What can we do?</h2>
<p>At first glance, the above issues paint a rather grim picture of the capabilities of raw point forecasts. As always, though, the situation is much more nuanced.</p>
<p>Therefore, let us discuss a rough pathway of what to do if your point forecasts aren’t really cutting it.</p>
<section id="assess-if-you-even-have-a-problem-at-all---point-forecasts-can-still-work" class="level3">
<h3 class="anchored" data-anchor-id="assess-if-you-even-have-a-problem-at-all---point-forecasts-can-still-work">Assess if you even have a problem at all - point forecasts can still work</h3>
<p>As we have just seen, point forecasts can fail miserably. The fact that they are so widely used, however, indicates that they will not necessarily cause trouble for your problem. Many forecasting problems can be solved reasonably well with standard approaches.</p>
<p>Sometimes, you just need to put a little more effort into your model. Simply using another loss function or another non-linear transformation of your features might be sufficient. Once you observe that a point forecast simply won’t cut it, though, it might be time to go probabilistic.</p>
<p>Two cases can be good indicators:</p>
<section id="your-point-forecasts-show-little-variation-and-are-almost-constant." class="level4">
<h4 class="anchored" data-anchor-id="your-point-forecasts-show-little-variation-and-are-almost-constant.">1) Your point forecasts show little variation and are almost constant.</h4>
<p>Mathematically: <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7By%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots%20y_%7Bt-s%7D%20%5Capprox%20C%0A"> This is what happened in our examples and should be visible in your model validation steps. As we have seen, there is no reason to conclude that something is wrong with your model or your data yet.</p>
</section>
<section id="occasional-large-outliers-frequently-make-your-point-forecasts-useless." class="level4">
<h4 class="anchored" data-anchor-id="occasional-large-outliers-frequently-make-your-point-forecasts-useless.">2) Occasional, large outliers frequently make your point forecasts useless.</h4>
<p>This issue leads us into the domain of extreme-value theory and probably deserves a blog series of its own. Hence, we will only take a brief look at what is happening here.</p>
<p>As an exaggerated, yet illustrative example, consider the following time-series:</p>
<div id="cell-17" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> cauchy</span>
<span id="cb6-2"></span>
<span id="cb6-3">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb6-4"></span>
<span id="cb6-5">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb6-6"></span>
<span id="cb6-7">plt.plot(cauchy(np.sin(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>)),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>).rvs(), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Noisy observations"</span>)</span>
<span id="cb6-8">plt.plot(np.sin(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>)), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Theoretical sine wave"</span>)</span>
<span id="cb6-9">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb6-10">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-7-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This is nothing more than samples from a <a href="https://en.wikipedia.org/wiki/Cauchy_distribution?ref=sarem-seitz.com">Cauchy distribution</a> whose location is determined by a sine. Now, let us see how the MSE evolves with increasing sample size if our (point-) forecast was just a continuance of the underlying sine:</p>
<div id="cell-19" class="cell" data-execution_count="12">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb7-2"></span>
<span id="cb7-3">T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250000</span></span>
<span id="cb7-4"></span>
<span id="cb7-5">large_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> cauchy(np.sin(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.arange(T)),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>).rvs()</span>
<span id="cb7-6">optimal_point_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.arange(T))</span>
<span id="cb7-7"></span>
<span id="cb7-8">running_mse <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.cumsum((large_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> optimal_point_forecast)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-9"></span>
<span id="cb7-10">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb7-11">plt.plot(running_mse, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Running MSE of Cauchy-Sine forecast"</span>)</span>
<span id="cb7-12">plt.grid()</span>
<span id="cb7-13">plt.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-8-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Surprisingly, the MSE doesn’t even converge after <strong>250,000</strong> (!) observations. No matter how much data you observe, your <strong>average</strong> (!) squared error keeps growing. This is a property of a certain family of probability distributions to which the Cauchy belongs: since the Cauchy has no finite mean or variance, the law of large numbers does not apply and the running MSE never settles.</p>
<p>You will likely never observe such a monstrosity in your day-to-day life. Almost all real-world time-series adhere to certain limitations that make an infinite MSE unlikely.</p>
<p>Nevertheless, it would be helpful to get at least some idea of how likely you will observe large outliers. Imagine, for example, how valuable a rough probability of an extreme collapse of the tourism sector could have been in 2019.</p>
</section>
</section>
<section id="adjust-your-definition-of-a-useful-forecast" class="level3">
<h3 class="anchored" data-anchor-id="adjust-your-definition-of-a-useful-forecast">Adjust your definition of a ‘useful’ forecast</h3>
<p>Of course, it can be very difficult to convince your stakeholders of the above issues with point forecasts. For business folks, probabilistic approaches might look like unnecessary rocket science.</p>
<p>Rather, we typically measure forecasting success by how closely predictions match future observations. If something goes wrong, just add more data and hope that you’ll be better off next time. However, consider the underlying complexity of most time-series systems. What are your chances of ever collecting all the relevant data?</p>
<p>This is like trying to collect and process all relevant factors to predict the exact outcome of a game of roulette. While possible in theory, the sheer amount of granularity makes this impossible in practice.</p>
<p>Nevertheless, you might discover that there are some physical flaws in the roulette table. If these flaws are skewing the odds in a certain direction, making your bets accordingly could make you a fortune in the long run.</p>
<p>If we transfer this analogy to general forecasting problems, this leads us to a paradigm shift:</p>
<blockquote class="blockquote">
<p><strong>Instead of trying to predict the future as exactly as possible, forecast models should optimize our odds when betting on future outcomes. </strong></p>
</blockquote>
<p>Taking this betting metaphor further, we arrive at three conclusions for forecasting:</p>
<section id="real-world-decisions-are-almost-always-made-under-uncertainty" class="level4">
<h4 class="anchored" data-anchor-id="real-world-decisions-are-almost-always-made-under-uncertainty">1) Real-world decisions are almost always made under uncertainty</h4>
<p>Consider the following problem:</p>
<p>You are an ice cream vendor and want to optimize your daily inventory. For simplicity, we presume that each day, you either</p>
<ul>
<li>Sell exactly <code>10</code> pounds of ice-cream with a <code>90%</code> chance or</li>
<li>Sell <code>0</code> pounds with a <code>10%</code> chance (because the weather is really bad, you know)</li>
</ul>
<p>Also, presume that</p>
<ul>
<li>You can buy <code>1</code> pound of ice cream for <code>1</code> money at the beginning of each day</li>
<li>Sell <code>1</code> pound for <code>1.2</code> money</li>
<li>Your ice-cream inventory goes to zero at the end of each day (no overnight warehousing)</li>
<li>If your total losses exceed <code>-10</code> money you are going bankrupt</li>
</ul>
<p>Imagine you are building a demand forecast model for that problem to decide how much ice-cream you want to buy. If you go the point-forecast + MSE route, your result would be as follows:</p>
<p>Expected demand is <img src="https://latex.codecogs.com/png.latex?0.9%5Ccdot%2010+0.1%5Ccdot%200%20=%209">, therefore the MSE-minimizing forecast is also 9 per day. Are you going to buy <code>9</code> pounds of ice-cream each day? What about the risk of bankruptcy if you don’t sell anything multiple times in a row?</p>
<div id="cell-22" class="cell" data-execution_count="13">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb8-2"></span>
<span id="cb8-3">plt.stem([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10.</span>], [<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.9</span>], linefmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'b-'</span>, markerfmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'bo'</span>, basefmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'black'</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Probability Mass Function"</span>)</span>
<span id="cb8-4">plt.plot([<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">9</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">9</span>],[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Expected value / MSE-Minimizer"</span>)</span>
<span id="cb8-5">plt.xlabel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Ice-cream demand in pounds"</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)</span>
<span id="cb8-6">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb8-7">plt.legend(fontsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless_files/figure-html/cell-9-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This is the point where uncertainty comes into play and you need to decide on how much risk you want to take. As often in life, this is another trade-off between profit and risk.</p>
<p>Unfortunately, the point-forecast alone doesn’t account for any uncertainty.</p>
<p>Let us now presume that we had a probabilistic forecast model that was able to predict the respective probability mass function (pmf). From here, we can derive our earnings <img src="https://latex.codecogs.com/png.latex?R"> for day <img src="https://latex.codecogs.com/png.latex?t"> as a random variable given our inventory <img src="https://latex.codecogs.com/png.latex?x">: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AR%5Cleft(x_t%5Cright)=%20%5Cbegin%7Bcases%7D%5Cmin%20%5Cleft(x_t,%2010%5Cright)%20%5Ccdot%201.2-x_t%20&amp;%20%5Ctext%20%7B%20if%20%7D%20S_t=1%20%5C%5C%0A-x_t%20&amp;%20%5Ctext%20%7B%20if%20%7D%20S_t=0%5Cend%7Bcases%7D%20%5C%5C%0AS_t%20%5Cstackrel%7Bi%20.%20i%20.%20d%7D%7B%5Csim%7D%20%5Cmathcal%7BB%20e%20r%7D(0.9)%0A%5Cend%7Bgathered%7D%0A"> This information could then be used in a <a href="https://en.wikipedia.org/wiki/Stochastic_programming?ref=sarem-seitz.com#:~:text=A%20stochastic%20program%20is%20an,assumed%20to%20be%20known%20exactly.">stochastic program</a>. The latter can be seen as a probabilistic extension to deterministic optimization. Here, we can also account for and optimize our risk when dealing with real-world uncertainty.</p>
<p>In fact, real-world complexity is worlds beyond our little ice-cream example. Consider what this means for the likelihood that reality will diverge from your point forecasts.</p>
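<p>To make the stochastic-programming idea concrete, here is a minimal sketch for the ice-cream example (the function names, the 30-day horizon, and the simulation count are illustrative assumptions): given the predicted pmf, we can compute the expected earnings of each candidate inventory and Monte-Carlo the probability of going bankrupt.</p>

```python
import numpy as np

rng = np.random.default_rng(42)

P_SELL, DEMAND = 0.9, 10      # predicted pmf: sell 10 pounds w.p. 0.9, else 0
BUY, SELL = 1.0, 1.2          # prices per pound of ice-cream
RUIN, DAYS, SIMS = -10.0, 30, 20_000

def expected_earnings(x):
    """E[R(x)] = 0.9 * (min(x, 10) * 1.2 - x) + 0.1 * (-x)."""
    return P_SELL * (min(x, DEMAND) * SELL - x * BUY) + (1 - P_SELL) * (-x * BUY)

def ruin_probability(x):
    """Monte-Carlo estimate of P(cumulative earnings fall below -10 within DAYS)."""
    good_day = rng.random((SIMS, DAYS)) < P_SELL
    daily = np.where(good_day, min(x, DEMAND) * SELL - x * BUY, -x * BUY)
    return np.mean(np.cumsum(daily, axis=1).min(axis=1) < RUIN)

for x in (5, 9, 10):
    print(f"x={x:2d}: E[R(x)]={expected_earnings(x):.2f}, "
          f"P(ruin within {DAYS} days)~{ruin_probability(x):.3f}")
```

<p>The expected-value maximizer would buy <code>10</code> pounds here, but only the simulated ruin probabilities tell you what that choice costs in terms of risk.</p>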
</section>
<section id="many-small-bets-are-safer-than-a-few-large-ones" class="level4">
<h4 class="anchored" data-anchor-id="many-small-bets-are-safer-than-a-few-large-ones">2) Many small bets are safer than a few large ones</h4>
<p>Back to the flawed roulette table, imagine that the probability of 0 is slightly higher than expected. Would you place all your chips on 0 in a single run or place small amounts on it for many rounds?</p>
<p>If you are unlucky, even the smallest possible bet size could lead you into bankruptcy. The chances of this happening are, nevertheless, much larger if you go all-in in a single turn. While it is beyond this article to discuss proper bet sizing, the <a href="https://en.wikipedia.org/wiki/Kelly_criterion?ref=sarem-seitz.com">Kelly criterion</a> might be a useful start.</p>
<p>In practice, this could mean going from monthly forecasts to daily forecasts. That is of course a very simplistic recommendation. Subject to other factors, daily forecasts might still be less accurate or not useful at all. At this point, yours and your stakeholder’s expertise are necessary to find the right balance.</p>
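<p>For intuition, the Kelly criterion for a simple binary bet has a closed form: with win probability <code>p</code> and net odds <code>b</code>, the optimal fraction of the bankroll is <code>f* = p - (1 - p) / b</code>. A toy computation (the numbers are purely illustrative):</p>

```python
def kelly_fraction(p, b):
    """Kelly-optimal bankroll fraction for win probability p and net odds b-to-1."""
    return p - (1 - p) / b

# A slightly favourable even-money bet: win 52% of the time at 1-to-1 odds.
f = kelly_fraction(0.52, 1.0)
print(f"Kelly suggests betting {f:.0%} of the bankroll per round")  # → 4%
```

<p>Note how small the suggested stake is even for a genuine edge; this is the formal counterpart of placing many small bets instead of a few large ones.</p>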
</section>
<section id="sometimes-its-better-not-to-play-the-game-at-all" class="level4">
<h4 class="anchored" data-anchor-id="sometimes-its-better-not-to-play-the-game-at-all">3) Sometimes, it’s better not to play the game at all</h4>
<p>Let’s face it: there are situations where you can only lose in the long run. If the signal-to-noise ratio of your time-series is too low, it can be impossible to provide useful predictions.</p>
<p>Hedge funds with very deep pockets are <a href="https://edition.cnn.com/2019/07/10/investing/hedge-fund-drones-alternative-data/index.html?ref=sarem-seitz.com">paying absurd sums of money for alternative data</a>. All that just to make their forecasts a tiny bit more accurate than that of their competitors. Unless you have access to the same data (if it is even good at all), you are unlikely to consistently outperform them on the same bets.</p>
<p>If you have reached this point, you might want to look for new data to improve your forecasts. If that doesn’t help either, it could even make sense to stop relying on such forecasts altogether.</p>
</section>
</section>
<section id="create-multiple-point-forecasts-of-relevant-summary-statistics" class="level3">
<h3 class="anchored" data-anchor-id="create-multiple-point-forecasts-of-relevant-summary-statistics">Create multiple point forecasts of relevant summary statistics</h3>
<p>Instead of focusing on forecasting the mean via MSE-minimization (or the median through MAE-minimization), <img src="https://latex.codecogs.com/png.latex?%0A%5Cmathbb%7BE%7D%5Cleft%5BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-s%7D%5Cright%5D%0A"> you could predict other quantities that describe your distribution.</p>
<p>In Example 1, the most obvious would be conditional variance <img src="https://latex.codecogs.com/png.latex?%0A%5Coperatorname%7BVar%7D%5Cleft%5BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-s%7D%5Cright%5D.%0A"> You can find a short overview on how to forecast conditional variance in <a href="https://www.sarem-seitz.com/random-forests-and-boosting-for-arch-like-volatility-forecasts/#:~:text=Estimating%20Variance%20directly">this article</a>.</p>
<p>Once your model predicts a period of high variance, you could decide to play it safer. What ‘playing it safe’ means obviously depends on the context of your forecasting problem.</p>
<p>Example 2 might also benefit from a conditional variance forecast. However, notice that <a href="https://en.wikipedia.org/wiki/Skewness?ref=sarem-seitz.com">conditional skewness</a> is also playing a role here. One approach to deal with this situation might be a forecast of conditional quantiles, i.e. <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AF_%7BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots%20y_%7Bt-s%7D%7D%5E%7B-1%7D(%5Calpha)%20%5C%5C%0A%5Calpha%20%5Cin(0,1)%0A%5Cend%7Bgathered%7D%0A"> This is known as <a href="https://en.wikipedia.org/wiki/Quantile_regression?ref=sarem-seitz.com"><strong>quantile regression</strong></a> and, e.g., sklearn’s GradientBoostingRegressor actually implements the respective loss.</p>
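<p>As a minimal sketch of quantile regression with sklearn (the synthetic data and hyperparameters are illustrative), fitting one model per quantile yields a conditional prediction band:</p>

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic heteroskedastic data: the noise scale grows with the feature
X = np.sort(rng.uniform(0, 10, 500)).reshape(-1, 1)
y = np.sin(X.ravel()) + rng.normal(0, 0.1 + 0.05 * X.ravel())

# One model per conditional quantile, via the quantile loss
models = {
    alpha: GradientBoostingRegressor(loss="quantile", alpha=alpha).fit(X, y)
    for alpha in (0.1, 0.9)
}

lower, upper = models[0.1].predict(X), models[0.9].predict(X)
coverage = np.mean((y >= lower) & (y <= upper))
print(f"empirical coverage of the 10%-90% band: {coverage:.2f}")
```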
<p>Which quantities you should choose will ultimately depend on your specific problem. The biggest advantage here is that you don’t make any assumptions about the underlying distribution. Rather, you just let your model ‘learn’ the important aspects of the distribution that you care about.</p>
<p>On the other hand, it will be difficult to perform stochastic optimization with this approach. After all, you just compress the most relevant information into several point forecasts. If you want to calculate the formally best decision given some forecast, you will therefore likely have to make additional assumptions about the full predictive distribution.</p>
</section>
<section id="replace-your-point-forecast-by-a-probabilistic-forecast" class="level3">
<h3 class="anchored" data-anchor-id="replace-your-point-forecast-by-a-probabilistic-forecast">Replace your point forecast by a probabilistic forecast</h3>
<p>The most challenging but also the most holistic approach. As we saw, the success of probabilistic methods often depends on the probability distribution you choose.</p>
<p>Technically, <a href="https://en.wikipedia.org/wiki/Nonparametric_statistics?ref=sarem-seitz.com">non-parametric</a> and ML methods can learn a probability distribution from the data, too. Keep in mind, though, that time-series problems often involve far fewer observations than your typical ML use-case. As a result, these approaches can easily fall prey to overfitting here.</p>
<p>Especially if you are a Python user, you will probably have to implement many models yourself. Unlike R’s, the Python ecosystem around forecasting seems much more focused on point forecasts. In case you only need a SARIMAX-like solution, however, <code>statsmodels</code> will be your friend.</p>
<p>Below, I also summarized the three different approaches to forecasting that we have discussed so far. Keep in mind that there are advantages and disadvantages to all three.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-point-forecasts-are-completely-useless/forecasting_compared.png" class="img-fluid figure-img" alt="Different forecasting styles compared."></p>
<figcaption>Different forecasting styles compared.</figcaption>
</figure>
</div>
</section>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Hopefully, you now have a better idea of the pitfalls of point forecasts. Point forecasts are not bad per se; they just show you an incomplete picture of what is happening in an uncertain world.</p>
<p>On the other hand, probabilistic forecasts offer a much richer perspective on the future of a given time-series. If you need a sound approach to handle the uncertainty of real-world complex systems, this is the way to go. Keep in mind, though, that this route will require more manual effort in many situations.</p>
</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/when-point-forecasts-are-completely-useless.html</guid>
  <pubDate>Sun, 01 Jan 2023 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Why I prefer Probabilistic Forecasts - Hitting Time Probabilities</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities.html</link>
  <description><![CDATA[ 





<section id="executive-summary-by-chatgpt" class="level2">
<h2 class="anchored" data-anchor-id="executive-summary-by-chatgpt">Executive summary by <a href="https://chat.openai.com/chat?ref=sarem-seitz.com">ChatGPT</a></h2>
<p>Probabilistic forecasts are a more comprehensive way to predict future events compared to point forecasts. Probabilistic forecasts involve creating a model that predicts the entire probability distribution for a given future period, providing insight into all likely outcomes.</p>
<p>This allows for the derivation of both point and interval forecasts. Point forecasts are easier to communicate to non-technical stakeholders, but probabilistic forecasts provide a more complete picture of potential outcomes.</p>
<p>Probabilistic forecasts can also be used to answer questions about hitting times, or the first time a time-series enters a given subset of observation space. Hitting time probabilities are difficult to calculate analytically, but can be answered using Monte Carlo simulation with a probabilistic model.</p>
</section>
<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>In Data Science, forecasting often involves creating the best possible model for predicting future events. Usually, the “best” model is one that minimizes a given error metric such as the Mean-Squared Error (MSE). The end result is then a list of values that depicts the predicted trajectory of the time-series. A statistician or econometrician would call this a point forecast.</p>
<p>More traditional forecasting models typically forecast the whole probability distribution for a given future period. We will call those probabilistic forecasts from here on.</p>
<p>One advantage of probabilistic forecasts is the ability to derive both point forecasts and interval forecasts. Think of the latter as a time-series analogue of a confidence interval applied to a forecast.</p>
<p>Certainly, a point forecast is considerably easier to communicate to non-technical stakeholders. Who wants to deal with all likely outcomes - give me a single metric to base my decisions on!</p>
<p>Now, there is definitely a real risk of overly complicated solutions ending up in your company’s drawer. Nevertheless, we should not reduce complexity too much either, just to please our non-technical end users.</p>
<p>As an example, let us take a look at hitting time problems. This is a rather uncommon topic in your standard Data Science curriculum. Nevertheless, it is quite useful.</p>
</section>
<section id="hitting-times-and-hitting-time-probabilities" class="level2">
<h2 class="anchored" data-anchor-id="hitting-times-and-hitting-time-probabilities">Hitting times and hitting time probabilities</h2>
<p>For our purposes, we go with a very intuitive definition: A hitting time is simply the first time that our time-series enters some subset of observation space. Mathematically, <img src="https://latex.codecogs.com/png.latex?%0A%5Ctau_A=%5Cinf%20_t%5Cleft%5C%7Bt,%20X_t%20%5Cin%20A%5Cright%5C%7D%0A"> where <img src="https://latex.codecogs.com/png.latex?%0AA%20%5Csubset%20%5Cmathbb%7BR%7D%0A"> and we presume that the time-series has realizations in the real numbers. The latter is not a necessary requirement but makes the problem a little more tangible.</p>
<p>One possible question we can ask is when the process exceeds a given threshold for the first time. In that case, the subset we are interested in is <img src="https://latex.codecogs.com/png.latex?%0AA=%5C%7Bx,%20x%20%5Cgeq%20C%5C%7D%0A"> with <img src="https://latex.codecogs.com/png.latex?C"> the threshold of interest. Now, when we are talking about a hitting time <strong>probability</strong>, we want to know the probability distribution over the hitting time, i.e. <img src="https://latex.codecogs.com/png.latex?%0Ap_%7B%5Ctau_A%7D(T)=P%5Cleft(%5Ctau_A=T%5Cright)%0A"> For a continuous-time series, p is usually a probability density. As most time-series problems in Data Science are discrete, though, let us concentrate on that case. Consequently, p is a probability mass function, which is usually easier to handle.</p>
<p>Unfortunately, hitting time probabilities are hard to calculate analytically and often intractable.</p>
<p>Luckily, a probabilistic model can answer hitting time questions via Monte-Carlo simulation. We will look at this approach further down below.</p>
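<p>As a minimal sketch of this Monte-Carlo idea (using a synthetic Gaussian random walk as a stand-in for any model we can sample trajectories from; threshold and parameters are arbitrary), we can estimate the hitting time distribution by simulating many paths and recording the first threshold crossing of each:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

C = 5.0           # threshold of interest
n_paths = 10_000  # Monte-Carlo sample size
horizon = 100     # forecast horizon in steps

# Simulate paths of a simple Gaussian random walk with drift, as a
# stand-in for any probabilistic forecast we can sample from
paths = rng.normal(loc=0.1, scale=1.0, size=(n_paths, horizon)).cumsum(axis=1)

# First index at which each path exceeds the threshold; paths that
# never hit within the horizon are excluded from the estimate
hit = paths >= C
hit_any = hit.any(axis=1)
first_hit = hit[hit_any].argmax(axis=1)

# Empirical probability mass function over hitting times
pmf = np.bincount(first_hit, minlength=horizon) / n_paths
```

<p>The resulting array is an empirical probability mass function over hitting times; its total mass equals the share of paths that hit the threshold within the horizon at all.</p>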
<p>At first, the idea of hitting time probability might look like a nice toy problem with little practical relevance. However, consider even a simple capacity planning problem. A company might have to decide when to expand their operational capacity due to increased demand.</p>
<p>On the one hand, this can certainly be answered by a point forecast to some extent. Just pick the timestamp where your forecast exceeds the threshold as your predicted hitting time. If a point forecast was sufficient in the first place, a ‘point forecast’ of the hitting time will surely work fine, too.</p>
<p>Let us see what happens in a simple example:</p>
</section>
<section id="hitting-time-probabilities-in-the-wild" class="level2">
<h2 class="anchored" data-anchor-id="hitting-time-probabilities-in-the-wild">Hitting time probabilities in the wild</h2>
<p>To keep it simple, we use the good old <a href="https://www.kaggle.com/datasets/rakannimer/air-passengers?ref=sarem-seitz.com">Air Passengers</a> dataset. Keep in mind that a single experiment is far from sufficient to draw any generalizing conclusions.</p>
<div id="cell-6" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"></span>
<span id="cb1-4">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.read_csv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/AirPassengers.csv"</span>)</span>
<span id="cb1-5">df.index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.to_datetime(df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Month"</span>])</span>
<span id="cb1-6"></span>
<span id="cb1-7">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#Passengers"</span>]</span>
<span id="cb1-8"></span>
<span id="cb1-9">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb1-10">plt.plot(y,label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"#Passengers"</span>)</span>
<span id="cb1-11">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-12">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"AirPassengers.csv"</span>)</span>
<span id="cb1-13">plt.legend()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>While the data is heavily dated, it is simple enough to help us make a point quickly.</p>
<section id="hitting-time-point-predictions" class="level3">
<h3 class="anchored" data-anchor-id="hitting-time-point-predictions">Hitting time point predictions</h3>
<p>First, let us consider how to solve the hitting time problem using a standard point forecast. Here, all we can do is determine deterministically when our forecast crosses a certain threshold.</p>
<p>Here, I chose the arbitrary threshold of <strong>550 passengers</strong>. For a fictitious airline company behind the data, this could give an important clue for when to increase fleet capacity.</p>
<p>For the point forecast approach, the procedure is now straightforward:</p>
<ol type="1">
<li>Fit an arbitrary time-series model (here we’ll use a SARIMAX model with order (1,1,1) and seasonal order (0,1,0,12) to capture trend and yearly seasonality).</li>
<li>Forecast over a horizon that is sufficiently long for the time-series to exceed the given threshold.</li>
<li>Mark the timestamp where the forecast exceeds the threshold for the first time as your hitting time.</li>
</ol>
<p>With <code>statsmodels.tsa.statespace.sarimax.SARIMAX</code>, this looks as follows:</p>
<div id="cell-8" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> statsmodels.tsa.statespace.sarimax <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> SARIMAX</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb2-3"></span>
<span id="cb2-4">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">36</span>]</span>
<span id="cb2-5">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">36</span>:]</span>
<span id="cb2-6"></span>
<span id="cb2-7">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> SARIMAX(endog <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train,</span>
<span id="cb2-8">                order <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb2-9">                seasonal_order<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)).fit(disp<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-10"></span>
<span id="cb2-11"></span>
<span id="cb2-12">point_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.forecast(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">36</span>)</span>
<span id="cb2-13"></span>
<span id="cb2-14"></span>
<span id="cb2-15">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb2-16">plt.plot(y_test, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Out-of-sample data"</span>)</span>
<span id="cb2-17">plt.plot(point_forecast, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"SARIMAX point forecast"</span>)</span>
<span id="cb2-18">plt.axvline(point_forecast.index[np.argmax(point_forecast<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">550</span>)], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Point forecast hitting time"</span>)</span>
<span id="cb2-19">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-20"></span>
<span id="cb2-21">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hitting time via point forecast"</span>)</span>
<span id="cb2-22">plt.legend()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/statsmodels/tsa/base/tsa_model.py:471: ValueWarning: No frequency information was provided, so inferred frequency MS will be used.
  self._init_dates(dates, freq)
/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/statsmodels/tsa/base/tsa_model.py:471: ValueWarning: No frequency information was provided, so inferred frequency MS will be used.
  self._init_dates(dates, freq)</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities_files/figure-html/cell-3-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Looking at the out-of-sample set in hindsight, we see that our hitting time forecast was one year late. In a real-world application, being one year late could be arbitrarily bad for your business case.</p>
<p>As we will see, the probabilistic variant gives a much more complete picture. Unfortunately, we cannot calculate the respective probability mass function in closed form.</p>
</section>
<section id="monte-carlo-estimation-of-hitting-time-probabilities" class="level3">
<h3 class="anchored" data-anchor-id="monte-carlo-estimation-of-hitting-time-probabilities">Monte-Carlo estimation of hitting time probabilities</h3>
<p>Luckily, <code>statsmodels</code>’ SARIMAX provides both mean and standard deviation forecasts. As the forecast distribution is Gaussian, we can use that knowledge for a <a href="https://en.wikipedia.org/wiki/Monte_Carlo_method?ref=sarem-seitz.com">Monte-Carlo simulation</a>. From there, we can estimate the probability for each month being the hitting time for <code>C=550</code>:</p>
<div id="cell-11" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb4-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> norm</span>
<span id="cb4-3"></span>
<span id="cb4-4">means <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.get_forecast(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>).predicted_mean</span>
<span id="cb4-5">stds <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.get_forecast(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>).se_mean</span>
<span id="cb4-6"></span>
<span id="cb4-7">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb4-8">hits <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.argmax(norm(means,stds).rvs()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">550</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)]</span>
<span id="cb4-9">hit_dates <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [means.index[hit] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> hit <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> hits]</span>
<span id="cb4-10"></span>
<span id="cb4-11">probs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.Series(hit_dates).value_counts()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span></span>
<span id="cb4-12"></span>
<span id="cb4-13">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb4-14">plt.bar(probs.index, probs.values,width<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Forecasted hitting time probabilities"</span>)</span>
<span id="cb4-15"></span>
<span id="cb4-16">plt.axvline(means.index[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">int</span>(np.mean(hits))], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>,label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Approx. mean hitting time"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb4-17">plt.axvline(means.index[np.argmax(model.forecast(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">36</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">550</span>)],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Point forecast hitting time"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb4-18">plt.axvline(means.index[np.argmax(y_test<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">550</span>)],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Actual hitting time (y&gt;=550)"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb4-19">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-20"></span>
<span id="cb4-21">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hitting time probabilities via probabilistic forecast"</span>)</span>
<span id="cb4-22">plt.legend()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This looks much better. Our model predicts that the time-series is most likely to exceed the threshold one year before the point forecast prediction. The hindsight data also agrees much better with this prediction.</p>
<p>Additionally, we see that the point forecast hitting time (red line) does not coincide with the expectation (purple line) of the probabilistic variant either. This is significant insofar as the point forecast of the actual time-series is in fact the mean of the probabilistic forecast.</p>
<p>The hitting time, however, is a non-linear functional of the forecast trajectory, so this equality does not carry over to the mean hitting time.</p>
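<p>To see why the crossing time of the mean trajectory need not equal the mean of the hitting times, consider a deliberately tiny two-path example (purely illustrative, not fitted to the data above):</p>

```python
import numpy as np

C = 1.0
# Two equally likely trajectories: one crosses C early, one late
path_a = np.array([0.0, 1.2, 1.2, 1.2, 1.2, 1.2])
path_b = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.2])

def hitting_time(path, c):
    # First index where the path reaches or exceeds c
    return int(np.argmax(path >= c))

# Route 1: hitting time of the mean trajectory (the point forecast route)
mean_path = (path_a + path_b) / 2
t_of_mean = hitting_time(mean_path, C)  # crosses at t=4

# Route 2: mean of the per-path hitting times (the probabilistic route)
mean_of_t = (hitting_time(path_a, C) + hitting_time(path_b, C)) / 2  # (1+5)/2 = 3.0
```

<p>The hitting time does not commute with taking expectations, so averaging trajectories first and averaging hitting times first give different answers.</p>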
<p>Finally, let us look at the Cumulative Distribution Function of our mass function estimate:</p>
<div id="cell-13" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb5-2">plt.plot(probs.sort_index().cumsum(), lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Hitting time probability c.d.f."</span>)</span>
<span id="cb5-3">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb5-4"></span>
<span id="cb5-5">plt.legend()</span>
<span id="cb5-6">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Cumulative distribution of hitting time probabilities"</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities_files/figure-html/cell-5-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Here, the probability of threshold exceedance is already beyond 60% by the second year, not the third - yet another reason why the point forecast hitting time is inappropriate.</p>
</section>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>While working with point forecasts is often more convenient, such complexity reduction can go too far in some instances. Even in this rather simple example, the ‘simple’ approach was already off by one year.</p>
<p>Certainly, your particular hitting time problem might allow you to go the straightforward route. Keep in mind, however, that you will only be able to judge the quality of your forecast after the fact. By then, it will obviously be too late to switch to the more sophisticated but also more holistic approach discussed above.</p>
<p>In the end, a probabilistic forecast can always be reduced to a single point forecast. The reverse, unfortunately, is not possible. Personally, I highly recommend the probabilistic route, as there are many other advantages to it.</p>
<p>In future articles, I am planning to provide more insights about those other advantages. If you are interested, feel free to subscribe to get notified.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Bas, Esra. Basics of Probability and Stochastic Processes. Springer International Publishing, 2019.</p>
<p><strong>[2]</strong> Hamilton, James Douglas. Time series analysis. Princeton university press, 2020.</p>
<p><strong>[3]</strong> Hyndman, Rob J., and George Athanasopoulos. Forecasting: principles and practice. OTexts, 2018.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/why-i-prefer-probabilistic-forecasts-hitting-time-probabilities.html</guid>
  <pubDate>Tue, 06 Dec 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Random Forests and Boosting for ARCH-like volatility forecasts</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/random-forests-and-boosting-for-arch-like-volatility-forecasts.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>In <a href="https://www.sarem-seitz.com/forecasting-with-decision-trees-and-random-forests/">the last article</a>, we discussed how Decision Trees and Random Forests can be used for forecasting. While mean and point forecasts are the most obvious applications, they might not always be the most useful ones.</p>
<p>Consider the classic example of financial returns, where the conditional mean is hard, if not impossible, to predict. Conditional variance, on the other hand, has been shown to exhibit auto-regressive properties that can be modelled. In fact, there exist countless models from the <a href="https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity?ref=sarem-seitz.com">(G)ARCH-family</a> that enjoy widespread popularity.</p>
<p>Most GARCH models make primarily linear assumptions about the auto-regressive patterns. This raises the question of whether we can introduce non-linear relationships for more flexibility. Given the promising performance of tree models for mean forecasts, the roadmap is clear: can we build respective models for variance forecasts?</p>
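<p>To make the linearity point concrete, here is a sketch of an ARCH(1) process, whose conditional variance is an affine function of the squared lag (the parameter values are arbitrary choices for illustration):</p>

```python
import numpy as np

rng = np.random.default_rng(42)

# ARCH(1): the conditional variance is a *linear* map of the squared lag,
# sigma_t^2 = omega + alpha * y_{t-1}^2
omega, alpha = 0.1, 0.6
n = 1_000

y = np.zeros(n)
sigma2 = np.zeros(n)
sigma2[0] = omega / (1 - alpha)  # unconditional variance as start value

for t in range(1, n):
    sigma2[t] = omega + alpha * y[t - 1] ** 2
    y[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
```

<p>A tree-based model would replace this fixed affine map by a learned, potentially non-linear function of the lags.</p>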
</section>
<section id="conditional-variance-and-modelling-options" class="level2">
<h2 class="anchored" data-anchor-id="conditional-variance-and-modelling-options">Conditional variance and modelling options</h2>
<p>Although there exists a lot of information on conditional variance models on the internet, let us quickly walk through the basics. For more insights, feel free to study the references in the <a href="https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity?ref=sarem-seitz.com">GARCH wikipedia article</a>. Alternatively, you might also get some additional insights from these two articles <a href="https://www.sarem-seitz.com/multivariate-garch-with-python-and-tensorflow/">here</a> and <a href="https://www.sarem-seitz.com/lets-make-garch-more-flexible-with-normalizing-flows/">here</a>.</p>
<p>To get started, we state our forecast target: <img src="https://latex.codecogs.com/png.latex?%0A%5Coperatorname%7BVar%7D%5Cleft(Y_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%0A"> Pretty simple - we want to predict variance of a time-series based on the series’ past realizations. There is only one problem: <strong>We do not observe the time-series’ variance</strong>.</p>
<p>Of course, this is also true for the time-series’ mean. However, by minimizing the mean-squared-error, <a href="https://math.stackexchange.com/questions/2554243/understanding-the-mean-minimizes-the-mean-squared-error?ref=sarem-seitz.com">our model will generate predictions for the mean</a> (a.k.a. expected value). In a time-series modelling problem, this looks as follows: <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7Bf%7D%5E*=%5Cunderset%7B%5Chat%7Bf%7D%20%5Cin%20%5Cmathcal%7BM%7D%7D%7B%5Coperatorname%7Bargmin%7D%7D%20%5Cmathbb%7BE%7D%5Cleft%5B%5Cleft(%5Chat%7Bf%7D%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)-y_t%5Cright)%5E2%5Cright%5D%0A"> with <img src="https://latex.codecogs.com/png.latex?%5Cmathcal%7BM%7D"> the set of all admissible models <img src="https://latex.codecogs.com/png.latex?%0A%5CRightarrow%20%5Chat%7Bf%7D%5E*%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%20%5Capprox%20%5Cmathbb%7BE%7D%5Cleft%5BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D%0A"> In words: a model that minimizes the MSE of the conditionals can be used as an estimator for the conditional mean. Of course, this requires the set of admissible models to be large enough. A linear model, for example, will fail as an approximation for a highly non-linear conditional mean.</p>
<p>Luckily, Decision Tree ensembles are very flexible and thus should make a decent set of candidate models.</p>
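<p>The claim that the MSE minimizer recovers the mean can be checked numerically in the simplest possible setting, namely constant predictions over a Gaussian sample (a toy check of the principle, not part of the derivation above):</p>

```python
import numpy as np

rng = np.random.default_rng(7)
y = rng.normal(loc=3.0, scale=1.0, size=2_000)

# MSE of every constant prediction c on a grid: the curve is a parabola
# minimized (up to grid resolution) at the sample mean of y
grid = np.linspace(0.0, 6.0, 601)
mse = ((y[:, None] - grid[None, :]) ** 2).mean(axis=0)
best = grid[np.argmin(mse)]  # close to y.mean()
```

<p>A flexible model minimizing the MSE conditional on the lags performs the same minimization within each neighbourhood of the feature space, which is what yields an estimate of the conditional mean.</p>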
<p>Now hopes are up that there exists a suitable loss function that yields a similar estimator for conditional variances. Indeed, there is a very convenient way:</p>
<section id="estimating-variance-directly" class="level3">
<h3 class="anchored" data-anchor-id="estimating-variance-directly">Estimating Variance directly</h3>
<p>This approach requires only a tiny bit of probability theory and a loose assumption on our data. Recall that the variance of any random variable can be decomposed as follows: <img src="https://latex.codecogs.com/png.latex?%0A%5Coperatorname%7BVar%7D%5Cleft(Y_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)=E%5Cleft%5BY_t%5E2%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D-E%5Cleft%5BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D%5E2%0A"> Notice that the expectation on the right can be estimated via the MSE minimization from before. Thus, with a respective estimator (e.g.&nbsp;Decision Tree ensemble), we can make use of the next formula: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Ctilde%7BY%7D_t%5Cleft%7Cy_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D=Y_t%5Cright%7C%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D-%5Cmathbb%7BE%7D%5Cleft%5BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D%20%5C%5C%0A%5CRightarrow%20%5Cmathbb%7BE%7D%5Cleft%5B%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D=0,%20%5C%5C%0A%5CRightarrow%20%5Coperatorname%7BVar%7D%5Cleft(%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)=%5Coperatorname%7BVar%7D%5Cleft(Y_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%0A%5Cend%7Bgathered%7D%0A"> Obviously, we don’t observe the actual conditional mean but only have our tree-based estimator at hand. 
Therefore: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Ctilde%7BY%7D_t%5Cleft%7Cy_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D=Y_t%5Cright%7C%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D-%5Chat%7Bf%7D%5E*%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%20%5C%5C%0A%5CRightarrow%20%5Cmathbb%7BE%7D%5Cleft%5B%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D%20%5Capprox%200,%20%5C%5C%0A%5CRightarrow%20%5Coperatorname%7BVar%7D%5Cleft(%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%20%5Capprox%20%5Coperatorname%7BVar%7D%5Cleft(Y_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%0A%5Cend%7Bgathered%7D%0A"> Plugging this back into the variance formula, we get: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Barray%7D%7Br%7D%0A%5Coperatorname%7BVar%7D%5Cleft(%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)=E%5Cleft%5B%5Ctilde%7BY%7D_t%5E2%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D-E%5Cleft%5B%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D%5E2%20%5C%5C%0A=E%5Cleft%5B%5Ctilde%7BY%7D_t%5E2%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D-0%5E2%20%5C%5C%0A%5CRightarrow%20%5Coperatorname%7BVar%7D%5Cleft(%5Ctilde%7BY%7D_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)=%5Coperatorname%7BVar%7D%5Cleft(Y_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%20%5C%5C%0A=E%5Cleft%5B%5Ctilde%7BY%7D_t%5E2%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D%0A%5Cend%7Barray%7D%0A"></p>
<p>This implies that the <strong>variance of our target variable</strong> equals the <strong>expectation of the squared mean-transformed variable</strong>.</p>
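<p>The unconditional analogue of this identity is easy to verify numerically (a small illustrative check, not part of the original derivation):</p>

```python
# Var(Y) equals the expectation of the squared mean-subtracted variable.
import numpy as np

rng = np.random.default_rng(7)
y = rng.normal(loc=2.0, scale=3.0, size=100_000)  # true variance: 9

mean_of_squared_residuals = np.mean((y - y.mean()) ** 2)
# With 100k samples, this estimate should sit very close to the true value 9.
```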
<p>Notice that we conditioned the transformed variable on the lagged original time-series. We can equally condition the transformed variable on the <strong>lagged transformed</strong> time-series. The latter is obviously much more convenient and popular. Thus, we’ll use this approach from now on.</p>
<p>Going back to our first equation, we can now easily build an estimator for this mean: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Chat%7Bf%7D%5E%7B*%20*%7D=%5Cunderset%7B%5Chat%7Bf%7D%20%5Cin%20%5Cmathcal%7BM%7D%7D%7B%5Coperatorname%7Bargmin%7D%7D%20%5Cmathbb%7BE%7D%5Cleft%5B%5Cleft(%5Chat%7Bf%7D%5Cleft(%5Ctilde%7By%7D_%7Bt-1%7D,%20%5Cldots,%20%5Ctilde%7By%7D_%7Bt-k%7D%5Cright)-%5Ctilde%7By%7D_t%5E2%5Cright)%5E2%5Cright%5D%20%5C%5C%0A%5CRightarrow%20%5Chat%7Bf%7D%5E%7B*%20*%7D%5Cleft(%5Ctilde%7By%7D_%7Bt-1%7D,%20%5Cldots,%20%5Ctilde%7By%7D_%7Bt-k%7D%5Cright)%20%5Capprox%20%5Cmathbb%7BE%7D%5Cleft%5B%5Ctilde%7BY%7D_t%5E2%20%5Cmid%20%5Ctilde%7By%7D_%7Bt-1%7D,%20%5Cldots,%20%5Ctilde%7By%7D_%7Bt-k%7D%5Cright%5D%20%5C%5C%0A%5Cquad%5Cleft(=%5Coperatorname%7BVar%7D%5Cleft(Y_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%5Cright)%0A%5Cend%7Bgathered%7D%0A"> We can now conclude the following:</p>
<blockquote class="blockquote">
<p>If we train another model to <strong>minimize MSE</strong> on the <strong>squared transformed variable</strong>, that model becomes an estimator for the conditional <strong>variance</strong>.</p>
</blockquote>
<p>Obviously, this method has one caveat: <strong>The model must only produce non-negative predictions</strong>. Otherwise, there is no guarantee that we won’t receive negative variance predictions on a test set.</p>
<p>This is where Decision Trees and Random Forests come in handy. The squared data for training is <strong>always non-negative</strong>. Therefore, a trained Decision Tree will only produce non-negative predictions, too.</p>
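<p>A minimal sketch of why this holds, on synthetic data with settings of my own choosing: every leaf value of a regression tree is an average of (non-negative) squared training targets, so predictions cannot turn negative.</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)
y = rng.normal(size=1000)  # a mean-zero toy series

# Lag matrix of the last 5 observations; the squared series is the target
X = np.column_stack([np.roll(y, i) for i in range(1, 6)])[5:]
target = y[5:] ** 2  # non-negative by construction

rf = RandomForestRegressor(max_depth=3, random_state=42).fit(X, target)

# Even on arbitrary unseen inputs, leaf averages stay non-negative
preds = rf.predict(rng.normal(size=(100, 5)))
```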
<p>A raw linear regression model, on the other hand, offers no such guarantee. Unfortunately, this also excludes Gradient Boosted Trees from the range of possible models: standard Boosting produces a weighted sum of Decision Trees, so <strong>a single negative weight could result in negative predictions</strong>.</p>
<p>There is, nevertheless, a way to make Gradient Boosting work as well:</p>
</section>
<section id="using-an-adjusted-loss-function" class="level3">
<h3 class="anchored" data-anchor-id="using-an-adjusted-loss-function">Using an adjusted loss function</h3>
<p>Since Gradient Boosting is so powerful, it would be a bummer not to be able to use it for our problem. Luckily, most popular Boosting libraries allow you to define custom loss functions. This is exactly what we will do here.</p>
<p>Consider again the mean-subtracted variable: <img src="https://latex.codecogs.com/png.latex?%0A%5Cmathbb%7BE%7D%5Cleft%5B%5Ctilde%7BY%7D_t%20%5Cmid%20%5Ctilde%7By%7D_%7Bt-1%7D,%20%5Cldots,%20%5Ctilde%7By%7D_%7Bt-k%7D%5Cright%5D%20%5Capprox%200%0A"> As a result, we can impose the following distributional assumption: <img src="https://latex.codecogs.com/png.latex?%0A%5Ctilde%7BY%7D_t%20%5Cmid%20%5Ctilde%7By%7D_%7Bt-1%7D,%20%5Cldots,%20%5Ctilde%7By%7D_%7Bt-k%7D%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5Cexp%20%5Cleft(%5Chat%7Bf%7D%5Cleft(%5Ctilde%7By%7D_%7Bt-1%7D,%20%5Cldots,%20%5Ctilde%7By%7D_%7Bt-k%7D%5Cright)%5Cright)%5Cright)%0A"> The output of a Gradient Boosting algorithm itself can be negative; exponentiating it guarantees a strictly positive variance. Now we simply need to translate the above into a log-likelihood loss function, calculate gradient and hessian, and plug everything into a Boosting algorithm: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AL%5Cleft(%5Chat%7Bf%7D_t,%20%5Ctilde%7By%7D_t%5Cright)%20%5C%5C%0A=-0.5%20%5Clog%20(2%20%5Cpi)-0.5%20%5Clog%20%5Cleft(%5Cexp%20%5Cleft(%5Chat%7Bf%7D_t%5Cright)%5Cright)-%5Cfrac%7B%5Ctilde%7By%7D_t%7B%20%7D%5E2%7D%7B2%20%5Cexp%20%5Cleft(%5Chat%7Bf%7D_t%5Cright)%7D%20%5C%5C%0A%5Ctext%20%7B%20where%20%7D%20%5Chat%7Bf%7D_t=%5Chat%7Bf%7D%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)%20%5C%5C%0A%5CRightarrow%20%5Cfrac%7B%5Cpartial%20L%5Cleft(%5Chat%7Bf%7D_t,%20%5Ctilde%7By%7D_t%5Cright)%7D%7B%5Cpartial%20%5Chat%7Bf%7D_t%7D=-0.5+%5Cfrac%7B%5Ctilde%7By%7D_t%7B%20%7D%5E2%7D%7B2%20%5Cexp%20(%5Chat%7Bf%7D)%7D%20%5C%5C%0A%5CRightarrow%20%5Cfrac%7B%5Cpartial%5E2%20L%5Cleft(%5Chat%7Bf%7D_t,%20%5Ctilde%7By%7D_t%5Cright)%7D%7B%5Cpartial%5E2%20%5Chat%7Bf%7D_t%7D=-%5Cfrac%7B%5Ctilde%7By%7D_t%5E2%7D%7B2%20%5Cexp%20(%5Chat%7Bf%7D)%7D%0A%5Cend%7Bgathered%7D%0A"> Keep in mind that we need to apply the chain-rule to our additional exponential transformation. For <a href="https://github.com/microsoft/LightGBM?ref=sarem-seitz.com">Microsoft’s LightGBM package</a>, our custom loss would now look as follows:</p>
<div id="cell-5" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#see e.g. https://hippocampus-garden.com/lgbm_custom/</span></span>
<span id="cb1-2"></span>
<span id="cb1-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> gaussian_loss(y_pred, data):</span>
<span id="cb1-4">    y_true <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data.get_label()</span>
<span id="cb1-5">    </span>
<span id="cb1-6">    loglikelihood <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>np.log(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>np.pi) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>y_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>np.exp(y_pred)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>y_true<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb1-7">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#remember that boosting minimizes the loss function but we want to maximize the loglikelihood</span></span>
<span id="cb1-8">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#thus, need to return the negative loglikelihood to the Boosting algorithm</span></span>
<span id="cb1-9">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#also applies to gradient and hessian below</span></span>
<span id="cb1-10">    </span>
<span id="cb1-11">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"loglike"</span>, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>loglikelihood, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb1-12"></span>
<span id="cb1-13"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> gaussian_loss_gradhess(y_pred, data):</span>
<span id="cb1-14">    y_true <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data.get_label()</span>
<span id="cb1-15">    </span>
<span id="cb1-16">    exp_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.exp(y_pred)</span>
<span id="cb1-17">    </span>
<span id="cb1-18">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#pay attention to the chain rule as we exp() the Boosting output before plugging it into the loglikelihood</span></span>
<span id="cb1-19">    grad <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>exp_pred<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>y_true<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> </span>
<span id="cb1-20">    hess <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>exp_pred<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>y_true<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb1-21"></span>
<span id="cb1-22">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>grad, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>hess</span></code></pre></div>
</div>
<p>Now, let us implement both approaches and compare them against benchmark models:</p>
</section>
</section>
<section id="tree-ensembles-for-volatility-forecasts-in-python" class="level2">
<h2 class="anchored" data-anchor-id="tree-ensembles-for-volatility-forecasts-in-python">Tree ensembles for volatility forecasts in Python</h2>
<p>In general, the implementation should be straightforward. Apart from avoiding careless mistakes, there is nothing too special that we need to be aware of.</p>
<p>As a dataset, we’ll use five years of standardized Dow Jones log-returns:</p>
<div id="cell-8" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb2-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> yfinance <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> yf</span>
<span id="cb2-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb2-5"></span>
<span id="cb2-6">symbol <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^DJI"</span></span>
<span id="cb2-7"></span>
<span id="cb2-8">data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> yf.download(symbol, start<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2017-10-01"</span>, end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022-10-01"</span>)</span>
<span id="cb2-9">returns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Close"</span>]).diff().dropna()</span>
<span id="cb2-10"></span>
<span id="cb2-11">n_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span></span>
<span id="cb2-12"></span>
<span id="cb2-13">train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> returns.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>n_test]</span>
<span id="cb2-14"></span>
<span id="cb2-15">train_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train.mean()</span>
<span id="cb2-16">train_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train.std()</span>
<span id="cb2-17"></span>
<span id="cb2-18">train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (train<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>train_mean)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>train_std</span>
<span id="cb2-19"></span>
<span id="cb2-20">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> returns.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>n_test:]</span>
<span id="cb2-21">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (test<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>train_mean)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>train_std <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#be careful not to spill information over into the test period</span></span>
<span id="cb2-22"></span>
<span id="cb2-23">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>))</span>
<span id="cb2-24">plt.plot(train, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train data"</span>)</span>
<span id="cb2-25">plt.plot(test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test data"</span>)</span>
<span id="cb2-26">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-27">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-28">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/random-forests-and-boosting-for-arch-like-volatility-forecasts_files/figure-html/cell-3-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Notice that we are performing a z-Normalization. This will make the subsequent GARCH model more robust to degenerate scaling of the time-series. Also, we reserve the last 30 days of data for a test set. Be careful not to introduce lookahead bias here.</p>
<p>Finally, it is reasonable to presume that conditional expected returns are always zero, i.e. <img src="https://latex.codecogs.com/png.latex?%0A%5Cmathbb%7BE%7D%5Cleft%5BY_t%20%5Cmid%20y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright%5D=0%0A"> Otherwise, somebody else would have likely discovered this anomaly before us and arbitraged it away. Therefore, we don’t need to apply the mean-subtraction in the first place. Rather, we’ll directly use raw log-returns for our variance estimation.</p>
<section id="random-forest-volatility-forecasts" class="level3">
<h3 class="anchored" data-anchor-id="random-forest-volatility-forecasts">Random Forest volatility forecasts</h3>
<p>For the Random Forest model, we use <a href="https://scikit-learn.org/stable/?ref=sarem-seitz.com">sklearn’s</a> <code>RandomForestRegressor</code>. To limit the risk of overfitting, we run the algorithm with <code>max_depth=3</code>. Given the typically noisy behaviour of financial time-series, such regularization seems reasonable.</p>
<p>Finally, we set <code>k=5</code> which lets our algorithm consider the past five observations for forecasting. While we could set this lower or higher, there is a tradeoff between omitting information and overfitting. In production, you would obviously do more backtesting to find the optimal number of lags.</p>
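<p>A hypothetical sketch of such a backtest (names and settings are my own; the post itself simply fixes <code>k=5</code>): evaluate several lag counts by one-step-ahead validation MSE on a held-out tail and keep the best one.</p>

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(123)
series = pd.Series(rng.normal(size=600))  # stand-in for the standardized returns

def validation_mse(series, n_lags, n_val=100):
    # Target: squared series; features: the n_lags most recent observations
    lagged = pd.concat([series**2] + [series.shift(i) for i in range(1, n_lags + 1)],
                       axis=1).dropna()
    y, X = lagged.iloc[:, 0], lagged.iloc[:, 1:]
    model = RandomForestRegressor(max_depth=3, random_state=0)
    model.fit(X.iloc[:-n_val].values, y.iloc[:-n_val].values)
    preds = model.predict(X.iloc[-n_val:].values)
    return float(np.mean((preds - y.iloc[-n_val:].values) ** 2))

scores = {k: validation_mse(series, k) for k in (1, 3, 5, 10)}
best_lag = min(scores, key=scores.get)
```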
<p>In code, we now have</p>
<div id="cell-10" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.ensemble <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> RandomForestRegressor</span>
<span id="cb4-2"></span>
<span id="cb4-3">n_lags <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span></span>
<span id="cb4-4"></span>
<span id="cb4-5">train_lagged <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([train<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>[train.shift(i) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,n_lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).dropna()</span>
<span id="cb4-6"></span>
<span id="cb4-7">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_lagged.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb4-8">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_lagged.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]</span>
<span id="cb4-9"></span>
<span id="cb4-10"></span>
<span id="cb4-11">forest_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> RandomForestRegressor(max_depth<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>, n_jobs<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, random_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb4-12">forest_model.fit(X_train.values, y_train.values)</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/var/folders/2d/hl2cr85d2pb2kfbmsng3267c0000gn/T/ipykernel_70730/2717592527.py:5: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  train_lagged = pd.concat([train**2]+[train.shift(i) for i in range(1,n_lags+1)],1).dropna()</code></pre>
</div>
<div class="cell-output cell-output-display" data-execution_count="3">
<pre><code>RandomForestRegressor(max_depth=3, n_jobs=-1, random_state=123)</code></pre>
</div>
</div>
<p>Since our model only predicts conditional variance one step ahead, we generate multi-step forecasts via Monte-Carlo sampling and use those samples to estimate a 90% forecast interval. Presuming Gaussian noise, this looks as follows:</p>
<div id="cell-12" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> norm</span>
<span id="cb6-2"></span>
<span id="cb6-3">samp_size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50000</span></span>
<span id="cb6-4"></span>
<span id="cb6-5">Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(pd.concat([train.shift(i) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_lags)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).dropna().iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,:].values.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb6-6">Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([Xt <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(samp_size)])</span>
<span id="cb6-7">Xt.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train.columns</span>
<span id="cb6-8"></span>
<span id="cb6-9">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb6-10"></span>
<span id="cb6-11">forest_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb6-12"></span>
<span id="cb6-13"></span>
<span id="cb6-14"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test)):</span>
<span id="cb6-15">    pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> forest_model.predict(Xt.values).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb6-16">    samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> norm(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).rvs(samp_size).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>np.sqrt(pred)</span>
<span id="cb6-17">    forest_samples.append(samp)</span>
<span id="cb6-18"></span>
<span id="cb6-19">    Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(np.concatenate([np.array(samp).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),Xt.values[:,:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb6-20">    Xt.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train.columns</span>
<span id="cb6-21">    </span>
<span id="cb6-22">    </span>
<span id="cb6-23">forest_samples_matrix <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate(forest_samples,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb6-24"></span>
<span id="cb6-25">forest_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(forest_samples_matrix,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-26">forest_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(forest_samples_matrix,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-27">forest_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(forest_samples_matrix,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/var/folders/2d/hl2cr85d2pb2kfbmsng3267c0000gn/T/ipykernel_70730/1944271020.py:5: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  Xt = pd.DataFrame(pd.concat([train.shift(i) for i in range(n_lags)],1).dropna().iloc[-1,:].values.reshape(1,-1))</code></pre>
</div>
</div>
<p>50,000 samples per timestamp should suffice for now. If you need higher accuracy, simply increase the sample size.</p>
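<p>As a rough sanity check on that sample size (a standalone sketch, not part of the original analysis): the Monte-Carlo standard error of a quantile estimate shrinks like <code>1/sqrt(n)</code>, so at 50,000 draws the interval endpoints are already quite stable:</p>

```python
import numpy as np

rng = np.random.default_rng(123)

# Repeatedly estimate the 95% quantile of a standard normal from 50,000 draws
# and check how much the estimate fluctuates across repetitions.
estimates = [np.quantile(rng.standard_normal(50_000), 0.95) for _ in range(100)]

print(np.mean(estimates))  # close to the true value of about 1.645
print(np.std(estimates))   # Monte-Carlo noise on the quantile estimate
```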
</section>
<section id="gradient-boosting-volatility-forecasts" class="level3">
<h3 class="anchored" data-anchor-id="gradient-boosting-volatility-forecasts">Gradient Boosting volatility forecasts</h3>
<p>The Gradient Boosting variant is quite similar to the Random Forest approach; we only need to handle the LightGBM specifics for data preprocessing. Also, keep in mind that the target now needs to be the raw, z-normalized time series, not the squared one.</p>
<div id="cell-14" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> lightgbm <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> lgb</span>
<span id="cb8-2"></span>
<span id="cb8-3">train_lagged <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([train]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>[train.shift(i) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,n_lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).dropna()</span>
<span id="cb8-4"></span>
<span id="cb8-5">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_lagged.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb8-6">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> train_lagged.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]</span>
<span id="cb8-7"></span>
<span id="cb8-8">train_data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> lgb.Dataset(X_train.values, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>y_train.values)</span>
<span id="cb8-9">param <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"num_leaves"</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"learning_rate"</span>:<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"seed"</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>}</span>
<span id="cb8-10"></span>
<span id="cb8-11">num_round <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span></span>
<span id="cb8-12">boosted_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> lgb.train(param, train_data, num_round, fobj<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>gaussian_loss_gradhess, feval<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>gaussian_loss)</span>
<span id="cb8-13"></span>
<span id="cb8-14">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb8-15"></span>
<span id="cb8-16">boosted_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb8-17"></span>
<span id="cb8-18"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test)):</span>
<span id="cb8-19">    pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> boosted_model.predict(Xt.values).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-20">    samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> norm(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).rvs(samp_size).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>np.sqrt(np.exp(pred))</span>
<span id="cb8-21">    boosted_samples.append(samp)</span>
<span id="cb8-22"></span>
<span id="cb8-23">    Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(np.concatenate([np.array(samp).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),Xt.values[:,:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb8-24">    Xt.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> X_train.columns</span>
<span id="cb8-25">    </span>
<span id="cb8-26">    </span>
<span id="cb8-27">boosted_samples_matrix <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate(boosted_samples,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-28"></span>
<span id="cb8-29">boosted_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(boosted_samples_matrix,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb8-30">boosted_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(boosted_samples_matrix,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb8-31">boosted_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(boosted_samples_matrix,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/var/folders/2d/hl2cr85d2pb2kfbmsng3267c0000gn/T/ipykernel_70730/729234487.py:3: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  train_lagged = pd.concat([train]+[train.shift(i) for i in range(1,n_lags+1)],1).dropna()</code></pre>
</div>
<div class="cell-output cell-output-stdout">
<pre><code>[LightGBM] [Warning] Using self-defined objective function
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.000495 seconds.
You can set `force_col_wise=true` to remove the overhead.
[LightGBM] [Info] Total Bins 1275
[LightGBM] [Info] Number of data points in the train set: 1223, number of used features: 5
[LightGBM] [Warning] Using self-defined objective function</code></pre>
</div>
</div>
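<p>The custom objective <code>gaussian_loss_gradhess</code> passed to <code>lgb.train</code> above was defined earlier in the post. For reference, a minimal standalone sketch of a Gaussian negative log-likelihood objective for LightGBM could look like the following, assuming the model predicts the log-variance; the function name and exact parameterization here are illustrative and may differ from the original definition:</p>

```python
import numpy as np

def gaussian_nll_gradhess(preds, train_data):
    """Gradient and Hessian of the Gaussian NLL w.r.t. preds = log-variance.

    For y ~ N(0, exp(s)), the per-observation negative log-likelihood is
        nll(s) = 0.5 * (s + y**2 * exp(-s)) + const,
    so grad = 0.5 * (1 - y**2 * exp(-s)) and hess = 0.5 * y**2 * exp(-s).
    """
    y = train_data.get_label()
    grad = 0.5 * (1.0 - y**2 * np.exp(-preds))
    hess = 0.5 * y**2 * np.exp(-preds)
    return grad, hess
```

<p>This matches the custom-objective signature LightGBM expects: a callable taking the raw predictions and the <code>Dataset</code>, returning per-observation gradient and Hessian arrays.</p>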
<p>To check if our models are actually any good, let us compare them against two benchmarks:</p>
</section>
<section id="garch-and-kernel-density-benchmarks-and-evaluation" class="level3">
<h3 class="anchored" data-anchor-id="garch-and-kernel-density-benchmarks-and-evaluation">GARCH and kernel density benchmarks and evaluation</h3>
<p>Since we want to improve conditional volatility forecasts, GARCH seems to be the most obvious comparison. To match the number of lags in our tree ensembles, we’ll use a GARCH(5,5) model.</p>
<p>As a second benchmark, let us use a simple i.i.d. kernel density fit. That is, we presume that each future observation is drawn independently from the same density as the training data:</p>
<div id="cell-17" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> arch <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> arch_model</span>
<span id="cb11-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> gaussian_kde</span>
<span id="cb11-3"></span>
<span id="cb11-4"></span>
<span id="cb11-5">am <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> arch_model(train, p<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>n_lags,q<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>n_lags)</span>
<span id="cb11-6">res <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> am.fit(update_freq<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>)</span>
<span id="cb11-7">forecasts <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> res.forecast(horizon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test), reindex<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>).variance.values[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,:]</span>
<span id="cb11-8"></span>
<span id="cb11-9">garch_samples_matrix <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> res.forecast(horizon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test), simulations <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50000</span>, reindex<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>, method <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"simulation"</span>).simulations.values[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,:,:]</span>
<span id="cb11-10"></span>
<span id="cb11-11">garch_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.std(garch_samples_matrix,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb11-12">garch_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(garch_samples_matrix,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb11-13">garch_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(garch_samples_matrix,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb11-14"></span>
<span id="cb11-15">iid_kde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gaussian_kde(train)</span>
<span id="cb11-16">iid_kde_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> iid_kde.resample((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50000</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))).reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50000</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))</span>
<span id="cb11-17"></span>
<span id="cb11-18">kde_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(iid_kde_samp,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb11-19">kde_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(iid_kde_samp,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>Iteration:      5,   Func. Count:     75,   Neg. LLF: 1688.2522911700148
Iteration:     10,   Func. Count:    145,   Neg. LLF: 1524.6402966915884
Iteration:     15,   Func. Count:    216,   Neg. LLF: 1306.756160681614
Iteration:     20,   Func. Count:    281,   Neg. LLF: 1306.5361795921172
Optimization terminated successfully    (Exit mode 0)
            Current function value: 1306.5361512322356
            Iterations: 23
            Function evaluations: 320
            Gradient evaluations: 23</code></pre>
</div>
</div>
<p>At last, the actual evaluation. As a performance measure, we use the out-of-sample log-likelihood (the higher, the better). That way, we will see which forecasted probability density performs best on our test set.</p>
<p>Since we can only sample from the forecast distributions, we fit a kernel density to each set of Monte-Carlo samples. For the i.i.d. benchmark, we already have the kernel density fitted on the training data.</p>
<p>Finally, we calculate the log-likelihood for the kernel densities as a proxy:</p>
<div id="cell-19" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb13-1">benchmark_lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [iid_kde.logpdf(test[i])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))]</span>
<span id="cb13-2">garch_lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [gaussian_kde(garch_samples_matrix[:,i]).logpdf(test[i])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))]</span>
<span id="cb13-3">forest_lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [gaussian_kde(forest_samples_matrix[:,i]).logpdf(test[i])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))]</span>
<span id="cb13-4">boosted_lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [gaussian_kde(boosted_samples_matrix[:,i]).logpdf(test[i])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(test))]</span>
<span id="cb13-5"></span>
<span id="cb13-6"></span>
<span id="cb13-7">fig, (ax1,ax2,ax3,ax4) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">19</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>))</span>
<span id="cb13-8">st <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fig.suptitle(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Symbol: "</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>symbol, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>)</span>
<span id="cb13-9"></span>
<span id="cb13-10"></span>
<span id="cb13-11">ax1.plot(train.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:], color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Last 50 observations of training set"</span>)</span>
<span id="cb13-12">ax1.plot(test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>)</span>
<span id="cb13-13">ax1.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb13-14">ax1.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb13-15">ax1.fill_between(test.index, forest_lower, forest_upper, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"orange"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Random Forest ARCH - 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb13-16">ax1.legend()</span>
<span id="cb13-17">ax1.set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Random Forest ARCH - Test set loglikelihood: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(forest_lpdfs))[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>]), fontdict<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'fontsize'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>})</span>
<span id="cb13-18"></span>
<span id="cb13-19"></span>
<span id="cb13-20">ax2.plot(train.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:], color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Last 50 observations of training set"</span>)</span>
<span id="cb13-21">ax2.plot(test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>)</span>
<span id="cb13-22">ax2.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb13-23">ax2.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb13-24">ax2.fill_between(test.index, boosted_lower, boosted_upper, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"orange"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Boosted Tree ARCH - 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb13-25">ax2.legend()</span>
<span id="cb13-26">ax2.set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gradient Boosting ARCH - Test set loglikelihood: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(boosted_lpdfs))[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>]), fontdict<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'fontsize'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>})</span>
<span id="cb13-27"></span>
<span id="cb13-28"></span>
<span id="cb13-29">ax3.plot(train.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:], color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Last 50 observations of training set"</span>)</span>
<span id="cb13-30">ax3.plot(test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>)</span>
<span id="cb13-31">ax3.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb13-32">ax3.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb13-33">ax3.fill_between(test.index, garch_lower, garch_upper, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"orange"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"GARCH (5,5) - 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb13-34">ax3.legend()</span>
<span id="cb13-35">ax3.set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"GARCH(5,5)- Test set loglikelihood: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(garch_lpdfs))[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>]), fontdict<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'fontsize'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>})</span>
<span id="cb13-36"></span>
<span id="cb13-37"></span>
<span id="cb13-38">ax4.plot(train.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:], color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Last 50 observations of training set"</span>)</span>
<span id="cb13-39">ax4.plot(test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test set"</span>)</span>
<span id="cb13-40">ax4.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb13-41">ax4.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb13-42">ax4.fill_between(test.index, kde_lower, kde_upper, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"orange"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I.i.d. Kernel Density - 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb13-43">ax4.legend()</span>
<span id="cb13-44">ax4.set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"I.i.d. KDE - Test set loglikelihood: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">str</span>(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(benchmark_lpdfs))[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>]), fontdict<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'fontsize'</span>: <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>})</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="7">
<pre><code>Text(0.5, 1.0, 'I.i.d. KDE - Test set loglikelihood: -46.681')</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/random-forests-and-boosting-for-arch-like-volatility-forecasts_files/figure-html/cell-8-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>In fact, our tree models perform best on our 30-day test set. Surprisingly, the GARCH model performed worse than our kernel density benchmark. Keep in mind, though, that a single evaluation does not allow any general conclusions.</p>
<p>Nevertheless, our Random Forest and Gradient Boosting ARCH models appear to work reasonably well.</p>
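<p>For completeness, the summed log-likelihood scores shown in the plot titles can be reproduced as sums of Gaussian log-densities of the test observations under each model's predicted scale. A minimal sketch with hypothetical inputs (<code>test_returns</code> and <code>sigma_pred</code> are placeholder names, not variables from the notebook):</p>

```python
import numpy as np
from scipy.stats import norm

# hypothetical inputs: test returns and model-predicted standard deviations
test_returns = np.array([0.5, -1.2, 0.3])
sigma_pred = np.array([0.8, 1.1, 0.9])

# per-observation Gaussian log-densities under a zero-mean assumption
lpdfs = norm.logpdf(test_returns, loc=0.0, scale=sigma_pred)

# summed test-set log-likelihood, as displayed in the plot titles
total_loglik = np.sum(lpdfs)
```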
</section>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>This article gave a quick demonstration of how tree ensembles can be used for volatility forecasts. Although the example models leave plenty of room for improvement, the initial ideas already seem to work quite well.</p>
<p>One possible enhancement would be a better choice of conditional distribution: all our models, except the kernel density benchmark, relied on Gaussianity assumptions. For financial time-series, this is likely a sub-optimal choice. If, for example, we wanted to account for heavy conditional tails, a <a href="https://en.wikipedia.org/wiki/Student%27s_t-distribution?ref=sarem-seitz.com">Student’s t</a>-distribution would be better suited.</p>
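<p>Swapping the distribution is straightforward with <code>scipy.stats</code>. The following minimal sketch (toy data and variable names are hypothetical, not from the notebook) fits both a Gaussian and a Student's t to a heavy-tailed sample and compares their log-likelihoods:</p>

```python
import numpy as np
from scipy.stats import norm, t as student_t

# toy "returns" with heavy tails, drawn from a t-distribution with 3 degrees of freedom
rng = np.random.default_rng(42)
returns = rng.standard_t(df=3, size=500)

# maximum-likelihood fits for both candidate conditional distributions
df_hat, loc_t, scale_t = student_t.fit(returns)
loc_n, scale_n = norm.fit(returns)

# summed log-likelihoods, analogous to the scores in the plots above
ll_t = student_t.logpdf(returns, df_hat, loc=loc_t, scale=scale_t).sum()
ll_norm = norm.logpdf(returns, loc=loc_n, scale=scale_n).sum()
```

<p>On heavy-tailed data such as this, the Student's t fit typically achieves the higher log-likelihood.</p>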
<p>PS: <a href="https://github.com/SaremS/sample_notebooks/blob/master/Decision%20Tree%20Ensembles%20for%20Volatility%20Forecasts.ipynb?ref=sarem-seitz.com">Notebook for this article is available on GitHub</a>.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Breiman, Leo. Random forests. Machine learning, 2001, 45.1, p.&nbsp;5-32.</p>
<p><strong>[2]</strong> Bollerslev, Tim. Modelling the coherence in short-run nominal exchange rates: a multivariate generalized ARCH model. The review of economics and statistics, 1990, p.&nbsp;498-505.</p>
<p><strong>[3]</strong> Ke, Guolin, et al.&nbsp;Lightgbm: A highly efficient gradient boosting decision tree. Advances in neural information processing systems, 2017, 30.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <category>Decision Trees</category>
  <guid>https://www.sarem-seitz.com/posts/random-forests-and-boosting-for-arch-like-volatility-forecasts.html</guid>
  <pubDate>Fri, 07 Oct 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Forecasting with Decision Trees and Random Forests</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Today, Deep Learning dominates many areas of modern machine learning. <a href="https://openreview.net/forum?id=vdgtepS1pV&amp;ref=sarem-seitz.com">On the other hand, Decision Tree based models still shine particularly for tabular data</a>. If you look up the winning solutions of respective Kaggle challenges, chances are high that a tree model is among them.</p>
<p>A key advantage of tree approaches is that they typically don’t require much fine-tuning for reasonable results. This is in stark contrast to Deep Learning, where different topologies and architectures can result in dramatic differences in model performance.</p>
<p>For time-series forecasting, decision trees are not as straightforward as for tabular data, though:</p>
</section>
<section id="challenges-with-trees-and-forests-for-forecasting" class="level2">
<h2 class="anchored" data-anchor-id="challenges-with-trees-and-forests-for-forecasting">Challenges with trees and forests for forecasting</h2>
<p>As you probably know, fitting any decision tree based method requires both input and output variables. In a univariate time-series problem, however, we usually only have our time-series as a target.</p>
<p>To work around this issue, we need to augment the time-series to make it suitable for tree models. Let us first discuss two intuitive, yet flawed approaches and why they fail. Naturally, the issues generalize to all Decision Tree ensemble methods.</p>
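<p>One common augmentation is to stack lagged values as inputs with the next observation as the target, turning the series into a supervised dataset. A minimal sketch (the helper name is illustrative, not from the article's code):</p>

```python
import numpy as np

def make_lagged(y, n_lags):
    """Stack n_lags past values as inputs; the next value is the target."""
    X = np.column_stack([y[i : len(y) - n_lags + i] for i in range(n_lags)])
    target = y[n_lags:]
    return X, target

y = np.arange(6.0)  # toy series: 0, 1, 2, 3, 4, 5
X, target = make_lagged(y, n_lags=2)
# each row of X is [y_{t-2}, y_{t-1}], the corresponding target is y_t
```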
<section id="decision-tree-forecasting-as-regression-against-time" class="level3">
<h3 class="anchored" data-anchor-id="decision-tree-forecasting-as-regression-against-time">Decision Tree forecasting as regression against time</h3>
<p>Probably the most intuitive approach is to consider the observed time-series as a function of time itself, i.e. <img src="https://latex.codecogs.com/png.latex?%0Ay_t=f(t)+%5Cepsilon%0A"> with some additive i.i.d. error term. <a href="https://www.sarem-seitz.com/facebook-prophet-covid-and-why-i-dont-trust-the-prophet/">In an earlier article</a>, I already made some remarks on why regression against time itself is problematic. For tree-based models, there is another problem:</p>
<blockquote class="blockquote">
<p><strong>Decision Trees for regression against time cannot extrapolate into the future.</strong></p>
</blockquote>
<p>By construction, Decision Tree predictions are averages over subsets of the training dataset. These subsets are formed by splitting the space of input data into axis-parallel hyperrectangles. Then, for each hyperrectangle, we take the average of all observed outputs inside it as the prediction.</p>
<p>For regression against time, those hyperrectangles are simply splits of the time axis into intervals. More precisely, these intervals are mutually exclusive and collectively exhaustive.</p>
<p>Predictions are then the arithmetic means of the time-series observations inside those intervals. Mathematically, this roughly translates to <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7Bf%7D(t)=%5Csum_%7Bi=1%7D%5EN%20%5Cfrac%7B1%7D%7B%5Cleft%7C%5Cleft%5C%7By_s,%20s%20%5Cin%5C%7B1%20;%20%5Cldots%20;%20T%5C%7D%20%5Ccap%20I_i%5Cright%5C%7D%5Cright%7C%7D%20%5Csum_%7Bs=1%7D%5ET%20y_s%20%5Ccdot%20%5Cmathbb%7BI%7D_%7Bs%20%5Cin%20I_i%7D%20%5Cmathbb%7BI%7D_%7Bt%20%5Cin%20I_i%7D%0A"> where <img src="https://latex.codecogs.com/png.latex?y_1,%20%5Cldots,%20y_T"> : training observations <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AI_i=%5Cleft(l_i,%20u_i%5Cright%5D%20%5C%5C%0Al_i,%20u_i%20%5Cin%20%5Cmathbb%7BR%7D_0%5E%7B+%7D%20;%20l_i%3Cu_i%20;%20l_1=0,%20u_N=%5Cinfty%20%5C%5C%0AI_i%20%5Ccap%20I_%7Bj%20%5Cneq%20i%7D=%5Cemptyset%20;%20%5Cbigcup_%7Bi=1%7D%5EN%20I_i=%5Cmathbb%7BR%7D_0%5E%7B+%7D%20%5C%5C%0A%5Cmathbb%7BI%7D_%7Bs%20%5Cin%20I_i%7D=%20%5Cbegin%7Bcases%7D1%20&amp;%20s%20%5Cin%20I_i%20%5C%5C%0A0%20&amp;%20%5Ctext%20%7B%20else%20%7D%5Cend%7Bcases%7D%0A%5Cend%7Bgathered%7D%0A"> Consider now using this model to predict the time-series at some time in the future. This reduces the above formula to the following: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Chat%7Bf%7D(%5Ctilde%7Bt%7D)=%5Cfrac%7B1%7D%7B%5Cleft%7C%5Cleft%5C%7By_s,%20s%20%5Cin%5C%7B1%20;%20%5Cldots%20;%20T%5C%7D%20%5Ccap%20I_T%5Cright%5C%7D%5Cright%7C%7D%20%5Csum_%7Bs=1%7D%5ET%20y_s%20%5Ccdot%20%5Cmathbb%7BI%7D_%7Bs%20%5Cin%20I_T%7D%20%5C%5C%0A%5Ctext%20%7B%20for%20all%20%7D%20%5Ctilde%7Bt%7D%20%5Cgeq%20T%0A%5Cend%7Bgathered%7D%0A"> In words: For any forecast, our model always predicts the average of the final training interval. Which is clearly useless…</p>
<p>Let us visualize this issue on a quick toy example:</p>
<div id="cell-4" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.tree <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> DecisionTreeRegressor</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-4"></span>
<span id="cb1-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#create data with linear trend</span></span>
<span id="cb1-6">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb1-7">t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb1-8">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)<span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#linear trend</span></span>
<span id="cb1-9"></span>
<span id="cb1-10">t_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-11">t_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-12"></span>
<span id="cb1-13">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>]</span>
<span id="cb1-14">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:]</span>
<span id="cb1-15"></span>
<span id="cb1-16">tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-17">tree.fit(t_train, y_train)</span>
<span id="cb1-18"></span>
<span id="cb1-19">y_pred_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(t_train)</span>
<span id="cb1-20">y_pred_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(t_test)</span>
<span id="cb1-21"></span>
<span id="cb1-22">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb1-23">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), y_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-24">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb1-25">         np.concatenate([[y_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test data"</span>, </span>
<span id="cb1-26">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-27"></span>
<span id="cb1-28"></span>
<span id="cb1-29">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), y_pred_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree insample predictions"</span>, </span>
<span id="cb1-30">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, lw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-31">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb1-32">         np.concatenate([[y_pred_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_pred_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree out-of-sample predictions"</span>, </span>
<span id="cb1-33">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-34"></span>
<span id="cb1-35">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-36">plt.axvline(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb1-37">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb1-38">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree VS. Time-Series with linear trend"</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb1-39"></span>
<span id="cb1-40">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The same issues obviously arise for seasonal patterns as well:</p>
<div id="cell-6" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#create data with seasonality </span></span>
<span id="cb2-2">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb2-3">t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb2-4">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sin(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> t) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)<span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#sine seasonality</span></span>
<span id="cb2-5"></span>
<span id="cb2-6">t_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-7">t_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-8"></span>
<span id="cb2-9">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>]</span>
<span id="cb2-10">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:]</span>
<span id="cb2-11"></span>
<span id="cb2-12">tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>)</span>
<span id="cb2-13">tree.fit(t_train, y_train)</span>
<span id="cb2-14"></span>
<span id="cb2-15">y_pred_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(t_train)</span>
<span id="cb2-16">y_pred_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(t_test)</span>
<span id="cb2-17"></span>
<span id="cb2-18">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb2-19">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), y_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-20">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb2-21">         np.concatenate([[y_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test data"</span>, </span>
<span id="cb2-22">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-23"></span>
<span id="cb2-24"></span>
<span id="cb2-25">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), y_pred_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree insample predictions"</span>, </span>
<span id="cb2-26">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, lw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb2-27">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb2-28">         np.concatenate([[y_pred_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_pred_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree out-of-sample predictions"</span>, </span>
<span id="cb2-29">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb2-30"></span>
<span id="cb2-31">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-32">plt.axvline(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb2-33">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb2-34">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree VS. Time-Series with seasonality"</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb2-35"></span>
<span id="cb2-36">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>To generalize the above in a single sentence:</p>
<blockquote class="blockquote">
<p><strong>Decision Trees fail for out-of-distribution data but in regression against time, every future point in time is out-of-distribution.</strong></p>
</blockquote>
<p>Thus, we need to find a different approach.</p>
</section>
<section id="decision-trees-for-auto-regressive-forecasting" class="level3">
<h3 class="anchored" data-anchor-id="decision-trees-for-auto-regressive-forecasting">Decision Trees for auto-regressive forecasting</h3>
<p>A far more promising approach is the auto-regressive one. Here, we simply view the future of a random variable as dependent on its past realizations. <img src="https://latex.codecogs.com/png.latex?%0Ay_t=f%5Cleft(y_%7Bt-1%7D,%20%5Cldots,%20y_%7Bt-k%7D%5Cright)+%5Cepsilon%0A"> While this approach is easier to handle than regression on time, it doesn’t come without a cost:</p>
<ol type="1">
<li><strong>The time-series must be observed at equidistant timestamps</strong>: If your time-series is measured at random times, you cannot use this approach without further adjustments.</li>
<li><strong>The time-series should not contain missing values</strong>: For many time-series models, this requirement is not mandatory. Our Decision Tree/Random Forest forecaster, however, will require a fully observed time-series.</li>
</ol>
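<p>Both requirements are easy to verify up front with pandas. The following is only a minimal sketch; the index, the daily frequency and the injected missing value are made up for illustration:</p>

```python
import numpy as np
import pandas as pd

# hypothetical daily series with one missing value
idx = pd.date_range("2024-01-01", periods=10, freq="D")
y = pd.Series(np.arange(10.0), index=idx)
y.iloc[3] = np.nan

# requirement 1: all timestamps are equidistant
spacings = y.index.to_series().diff().dropna()
equidistant = spacings.nunique() == 1

# requirement 2: no missing values
fully_observed = y.notna().all()

print(equidistant, fully_observed)  # True False
```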
<p>As these caveats are common for most popular time-series approaches, they aren’t too much of an issue.</p>
<p>Now, before jumping into an example, we need to take another look at a previously discussed issue: <strong>Tree-based models can only predict within the range of training data</strong>. This implies that we cannot just fit a Decision Tree or Random Forest to model auto-regressive dependencies.</p>
<p>To exemplify this issue, let’s do another example:</p>
<div id="cell-8" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb3-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.tree <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> DecisionTreeRegressor</span>
<span id="cb3-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb3-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb3-5"></span>
<span id="cb3-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#create data with linear trend</span></span>
<span id="cb3-7">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb3-8">t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb3-9">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)<span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#linear trend</span></span>
<span id="cb3-10"></span>
<span id="cb3-11">t_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-12">t_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-13"></span>
<span id="cb3-14"></span>
<span id="cb3-15">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>]</span>
<span id="cb3-16">X_train_shift <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([pd.Series(y_train).shift(t).values.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>:,:]</span>
<span id="cb3-17">y_train_shift <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>:]</span>
<span id="cb3-18">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:]</span>
<span id="cb3-19"></span>
<span id="cb3-20">tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb3-21">tree.fit(X_train_shift, y_train_shift)</span>
<span id="cb3-22"></span>
<span id="cb3-23">y_pred_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(X_train_shift).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-24"></span>
<span id="cb3-25">Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([X_train_shift[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),np.array(y_train_shift[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]).reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-26">predictions_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb3-27"></span>
<span id="cb3-28"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y_test)):</span>
<span id="cb3-29">    pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(Xt)</span>
<span id="cb3-30">    predictions_test.append(pred[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>])</span>
<span id="cb3-31">    Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([Xt[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),np.array(pred).reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-32">    </span>
<span id="cb3-33">y_pred_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(predictions_test)</span>
<span id="cb3-34"></span>
<span id="cb3-35"></span>
<span id="cb3-36">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb3-37">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), y_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb3-38">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb3-39">         np.concatenate([[y_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test data"</span>, </span>
<span id="cb3-40">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb3-41"></span>
<span id="cb3-42"></span>
<span id="cb3-43">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>:], y_pred_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree insample predictions"</span>, </span>
<span id="cb3-44">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, lw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb3-45">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb3-46">         np.concatenate([[y_pred_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_pred_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree out-of-sample predictions"</span>, </span>
<span id="cb3-47">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb3-48"></span>
<span id="cb3-49">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb3-50">plt.axvline(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb3-51">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb3-52">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree VS. Time-Series with linear trend"</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb3-53"></span>
<span id="cb3-54">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Again, not useful at all. To fix this last issue, we need to first remove the trend. Then we can fit the model, forecast the time-series and ‘re-trend’ the forecast.</p>
<p>For de-trending, we basically have two options:</p>
<ol type="1">
<li><strong>Fit a linear trend model</strong> - here we regress the time-series against time in a linear regression model. Its predictions are then subtracted from the training data to create a stationary time-series. This removes a constant, deterministic trend.</li>
<li><strong>Use first-differences</strong> - in this approach, we transform the time-series via <a href="https://otexts.com/fpp2/stationarity.html?ref=sarem-seitz.com#:~:text=series%20is%20stationary.-,Differencing,-In%20Figure%208.1">first order differencing</a>. In addition to the deterministic trend, this approach can also remove <a href="https://stats.stackexchange.com/questions/241144/explain-what-is-meant-by-a-deterministic-and-stochastic-trend-in-relation-to-the?ref=sarem-seitz.com">stochastic trends</a>.</li>
</ol>
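<p>Both options can be sketched in a few lines of numpy. This is only an illustration on the same synthetic linear-trend series as before, not part of the final model:</p>

```python
import numpy as np

np.random.seed(123)
t = np.arange(100)
y = t + 2 * np.random.normal(size=100)  # linear trend plus noise

# Option 1: fit a linear trend via least squares and subtract it
coeffs = np.polyfit(t, y, deg=1)          # slope and intercept
detrended = y - np.polyval(coeffs, t)     # residuals, mean zero by construction

# Option 2: first-order differencing
dy = np.diff(y)                           # Delta y_t = y_t - y_{t-1}

# differencing is invertible: the cumulated differences recover the series
y_reconstructed = np.concatenate([[y[0]], y[0] + np.cumsum(dy)])
assert np.allclose(y_reconstructed, y)
```

<p>Note that Option 2 shortens the series by one observation, which is exactly the loss of information mentioned below.</p>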
<p>As most time-series are driven by randomness, the second approach appears more reasonable. Thus, we now aim to forecast the transformed time-series <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%20y_t=y_t-y_%7Bt-1%7D%0A"> by an autoregressive model, i.e. <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%20y_t=f%5Cleft(%5CDelta%20y_%7Bt-1%7D,%20%5Cldots,%20%5CDelta%20y_%7Bt-k%7D%5Cright)+%5Cepsilon%0A"> Note that differencing and lagging both remove observations from the training data, so take care not to discard too much information this way; in particular, avoid using too many lagged variables when your dataset is small.</p>
<p>To obtain a forecast for the original time-series we need to retransform the differenced forecast via <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7By%7D_%7Bt+1%7D=%5Chat%7B%5CDelta%7D%20y_%7Bt+1%7D+y_t%0A"> and, recursively for further ahead forecasts: <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7By%7D_%7Bt+h%7D=%5Chat%7B%5CDelta%7D%20y_%7Bt+h%7D+%5Chat%7By%7D_%7Bt+h-1%7D%0A"> For our running example this finally leads to a reasonable solution:</p>
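<p>In code, this recursion collapses to a cumulative sum of the predicted differences, anchored at the last observed value. A minimal sketch with a hypothetical array of differenced forecasts:</p>

```python
import numpy as np

delta_pred = np.array([1.2, 0.8, 1.1])  # hypothetical differenced forecasts
y_last = 50.0                           # last observed value of the original series

# y_hat_{t+h} = y_t + cumulative sum of predicted differences up to step h
y_pred = y_last + np.cumsum(delta_pred)
print(y_pred)  # approximately [51.2, 52.0, 53.1]
```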
<div id="cell-10" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb4-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.tree <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> DecisionTreeRegressor</span>
<span id="cb4-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb4-4"></span>
<span id="cb4-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#create data with linear trend</span></span>
<span id="cb4-6">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb4-7">t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.arange(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb4-8">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> np.random.normal(size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)<span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#linear trend</span></span>
<span id="cb4-9"></span>
<span id="cb4-10">t_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-11">t_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-12"></span>
<span id="cb4-13">n_lags <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span></span>
<span id="cb4-14"></span>
<span id="cb4-15">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>]</span>
<span id="cb4-16">X_train_shift <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([pd.DataFrame(y_train).shift(t) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,n_lags)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).diff().values[n_lags:,:]</span>
<span id="cb4-17">y_train_shift <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.diff(y_train)[n_lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]</span>
<span id="cb4-18">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:]</span>
<span id="cb4-19"></span>
<span id="cb4-20">tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> DecisionTreeRegressor(max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-21">tree.fit(X_train_shift, y_train_shift)</span>
<span id="cb4-22"></span>
<span id="cb4-23">y_pred_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(X_train_shift).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-24"></span>
<span id="cb4-25">Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([X_train_shift[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),np.array(y_train_shift[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]).reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-26">predictions_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb4-27"></span>
<span id="cb4-28"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y_test)):</span>
<span id="cb4-29">    pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(Xt)</span>
<span id="cb4-30">    predictions_test.append(pred[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>])</span>
<span id="cb4-31">    Xt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([np.array(pred).reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),Xt[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-32">    </span>
<span id="cb4-33">y_pred_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(predictions_test)</span>
<span id="cb4-34"></span>
<span id="cb4-35">y_pred_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train[n_lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(y_pred_train)</span>
<span id="cb4-36">y_pred_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(y_pred_test)</span>
<span id="cb4-37"></span>
<span id="cb4-38"></span>
<span id="cb4-39"></span>
<span id="cb4-40">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb4-41">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), y_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-42">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb4-43">         np.concatenate([[y_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test data"</span>, </span>
<span id="cb4-44">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-45"></span>
<span id="cb4-46"></span>
<span id="cb4-47">plt.plot(t_train.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)[n_lags:], y_pred_train, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree insample predictions"</span>, </span>
<span id="cb4-48">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, lw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb4-49">plt.plot(np.concatenate([np.array(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]),t_test.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)]), </span>
<span id="cb4-50">         np.concatenate([[y_pred_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]],y_pred_test]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree out-of-sample predictions"</span>, </span>
<span id="cb4-51">         color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb4-52"></span>
<span id="cb4-53">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-54">plt.axvline(t_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"black"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb4-55">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb4-56">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Decision Tree VS. Time-Series with linear trend"</span>, fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">15</span>)</span>
<span id="cb4-57"></span>
<span id="cb4-58">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/var/folders/2d/hl2cr85d2pb2kfbmsng3267c0000gn/T/ipykernel_67956/2778805950.py:16: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  X_train_shift = pd.concat([pd.DataFrame(y_train).shift(t) for t in range(1,n_lags)],1).diff().values[n_lags:,:]</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-5-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
</section>
</section>
<section id="from-trees-to-random-forecast-forecasts" class="level2">
<h2 class="anchored" data-anchor-id="from-trees-to-random-forecast-forecasts">From trees to Random Forest forecasts</h2>
<p>Let us now apply the above approach to a real-world dataset. We use the <a href="https://www.kaggle.com/datasets/bulentsiyah/for-simple-exercises-time-series-forecasting?ref=sarem-seitz.com">alcohol sales data from the St.&nbsp;Louis Fed</a> database. For evaluation, we use the last four years as a holdout set:</p>
<div id="cell-12" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb6-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb6-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb6-4"></span>
<span id="cb6-5">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.read_csv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/Alcohol_Sales.csv"</span>)</span>
<span id="cb6-6">df.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"date"</span>, <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sales"</span>]</span>
<span id="cb6-7">df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"date"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.to_datetime(df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"date"</span>])</span>
<span id="cb6-8">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.set_index(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"date"</span>)</span>
<span id="cb6-9"></span>
<span id="cb6-10">df_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">48</span>]</span>
<span id="cb6-11">df_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">48</span>:]</span>
<span id="cb6-12"></span>
<span id="cb6-13">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>))</span>
<span id="cb6-14">plt.plot(df, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>)</span>
<span id="cb6-15">plt.plot(df_test, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test data"</span>)</span>
<span id="cb6-16">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb6-17">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-18">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Alcohol Sales"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="8">
<pre><code>Text(0.5, 1.0, 'Alcohol Sales')</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-6-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="growing-an-autoregressive-random-forest-for-forecasting" class="level2">
<h2 class="anchored" data-anchor-id="growing-an-autoregressive-random-forest-for-forecasting">Growing an autoregressive Random Forest for forecasting</h2>
<p>Since a single Decision Tree would be boring at best and inaccurate at worst, we’ll use a Random Forest instead. Besides the typical performance improvements, Random Forests allow us to generate forecast intervals.</p>
<p>To create Random Forest forecast intervals, we proceed as follows:</p>
<ol type="1">
<li><strong>Train an autoregressive Random Forest</strong>: This step is equivalent to fitting the Decision Tree as before.</li>
<li><strong>Use a randomly drawn Decision Tree at each forecast step</strong>: Instead of calling forest.predict(), we let a single, randomly drawn Decision Tree produce the forecast. Repeating this step many times yields a sample of Decision Tree forecasts.</li>
<li><strong>Calculate quantities of interest from the Decision Tree sample</strong>: These could range from the median to the standard deviation to more complex targets. We are primarily interested in the mean forecast and a 90% predictive interval.</li>
</ol>
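<p>The key to step 2 is that a fitted scikit-learn <code>RandomForestRegressor</code> exposes its individual fitted trees via the <code>estimators_</code> attribute. The following is a minimal sketch of the sampling idea on toy data; the feature shapes and the sample sizes (200 trees, 500 draws) are illustrative assumptions, not the setup used in this post:</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(123)

# Toy regression data as a stand-in for the lagged feature matrix.
X = rng.normal(size=(100, 3))
y = X @ np.array([0.5, -0.2, 0.1]) + 0.1 * rng.normal(size=100)

# Step 1: fit the forest as usual.
forest = RandomForestRegressor(n_estimators=200, max_depth=3, random_state=123)
forest.fit(X, y)

# Step 2: instead of forest.predict(), repeatedly draw a single tree at
# random from forest.estimators_ and let it forecast, building a sample.
x_next = X[-1].reshape(1, -1)
n_draws = 500
tree_preds = np.array([
    forest.estimators_[rng.integers(len(forest.estimators_))].predict(x_next)[0]
    for _ in range(n_draws)
])

# Step 3: summarize the sample into a mean forecast and a 90% interval.
mean_forecast = tree_preds.mean()
lower, upper = np.quantile(tree_preds, [0.05, 0.95])
```

<p>In a real forecast loop, the same sampling would be repeated at every step of the recursive multi-step prediction, as the class below does.</p>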
<p>The following Python class does everything we need:</p>
<div id="cell-14" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> sklearn.ensemble <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> RandomForestRegressor</span>
<span id="cb8-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> copy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> deepcopy</span>
<span id="cb8-3"></span>
<span id="cb8-4"></span>
<span id="cb8-5"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> RandomForestARModel():</span>
<span id="cb8-6">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Autoregressive forecasting with Random Forests</span></span>
<span id="cb8-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb8-9">    </span>
<span id="cb8-10">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, n_lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, n_estimators<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>, random_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>,</span>
<span id="cb8-11">                 log_transform <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>, first_differences <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>, seasonal_differences <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>):</span>
<span id="cb8-12">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb8-14"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_lags: Number of lagged features to consider in autoregressive model</span></span>
<span id="cb8-15"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            max_depth: Max depth for the forest's regression trees</span></span>
<span id="cb8-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            random_state: Random state to pass to random forest</span></span>
<span id="cb8-17"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            </span></span>
<span id="cb8-18"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            log_transform: Whether the input should be log-transformed</span></span>
<span id="cb8-19"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            first_differences: Whether the input should be singly differenced</span></span>
<span id="cb8-20"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            seasonal_differences: Seasonality to consider, if 'None' then no seasonality is presumed</span></span>
<span id="cb8-21"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb8-22">        </span>
<span id="cb8-23">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_lags <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_lags</span>
<span id="cb8-24">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> RandomForestRegressor(max_depth <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> max_depth, n_estimators <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_estimators, random_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> random_state)</span>
<span id="cb8-25">        </span>
<span id="cb8-26">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.log_transform <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> log_transform</span>
<span id="cb8-27">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.first_differences <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> first_differences</span>
<span id="cb8-28">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.seasonal_differences <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> seasonal_differences</span>
<span id="cb8-29">        </span>
<span id="cb8-30">        </span>
<span id="cb8-31">        </span>
<span id="cb8-32">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> fit(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y):</span>
<span id="cb8-33">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-34"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb8-35"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            y: training data (numpy array or pandas series/dataframe)</span></span>
<span id="cb8-36"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb8-37">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#enable pandas functions via dataframes</span></span>
<span id="cb8-38">        y_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.DataFrame(y)</span>
<span id="cb8-39">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> deepcopy(y_df)</span>
<span id="cb8-40">        </span>
<span id="cb8-41">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#apply transformations and store results for retransformations</span></span>
<span id="cb8-42">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.log_transform:</span>
<span id="cb8-43">            y_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(y_df)</span>
<span id="cb8-44">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_logged <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> deepcopy(y_df)</span>
<span id="cb8-45">        </span>
<span id="cb8-46">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.first_differences:</span>
<span id="cb8-47">            y_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_df.diff().dropna()</span>
<span id="cb8-48">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_diffed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> deepcopy(y_df)</span>
<span id="cb8-49">        </span>
<span id="cb8-50">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.seasonal_differences <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">is</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb8-51">            y_df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_df.diff(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.seasonal_differences).dropna()</span>
<span id="cb8-52">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_diffed_seasonal <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> deepcopy(y_df)</span>
<span id="cb8-53">        </span>
<span id="cb8-54">        </span>
<span id="cb8-55">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get lagged features</span></span>
<span id="cb8-56">        Xtrain <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([y_df.shift(t) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_lags<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],axis<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).dropna()</span>
<span id="cb8-57">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.Xtrain <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Xtrain</span>
<span id="cb8-58">        </span>
<span id="cb8-59">        ytrain <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_df.loc[Xtrain.index,:]</span>
<span id="cb8-60">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.ytrain <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ytrain</span>
<span id="cb8-61"></span>
<span id="cb8-62">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model.fit(Xtrain.values,ytrain.values.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb8-63"></span>
<span id="cb8-64">    </span>
<span id="cb8-65">    </span>
<span id="cb8-66">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> sample_forecast(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, n_periods <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, n_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>, random_seed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>):</span>
<span id="cb8-67">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-68"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Draw forecasting samples by randomly drawing from all trees in the forest per forecast period</span></span>
<span id="cb8-69"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb8-70"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_periods: Number of periods to forecast</span></span>
<span id="cb8-71"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_samples: Number of samples to draw</span></span>
<span id="cb8-72"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            random_seed: Random seed for numpy</span></span>
<span id="cb8-73"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb8-74">        samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._perform_forecast(n_periods, n_samples, random_seed)</span>
<span id="cb8-75">        output <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._retransform_forecast(samples, n_periods)</span>
<span id="cb8-76">        </span>
<span id="cb8-77">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> output</span>
<span id="cb8-78">    </span>
<span id="cb8-79">    </span>
<span id="cb8-80">    </span>
<span id="cb8-81">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _perform_forecast(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, n_periods, n_samples, random_seed):</span>
<span id="cb8-82">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-83"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Forecast transformed observations</span></span>
<span id="cb8-84"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb8-85"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_periods: Number of periods to forecast</span></span>
<span id="cb8-86"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_samples: Number of samples to draw</span></span>
<span id="cb8-87"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            random_seed: Random seed for numpy</span></span>
<span id="cb8-88"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb8-89">        samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb8-90">        </span>
<span id="cb8-91">        np.random.seed(random_seed)</span>
<span id="cb8-92">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_samples):</span>
<span id="cb8-93">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#store lagged features for each period</span></span>
<span id="cb8-94">            Xf <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.Xtrain.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:].values.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb8-95">                                 <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.ytrain.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].values.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-96"></span>
<span id="cb8-97">            forecasts <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb8-98"></span>
<span id="cb8-99">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_periods):</span>
<span id="cb8-100">                tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model.estimators_[np.random.randint(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.model.estimators_))]</span>
<span id="cb8-101">                pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tree.predict(Xf)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb8-102">                forecasts.append(pred)</span>
<span id="cb8-103">                </span>
<span id="cb8-104">                <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#update lagged features for next period</span></span>
<span id="cb8-105">                Xf <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate([Xf[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:],np.array([[pred]])],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb8-106">            </span>
<span id="cb8-107">            samples.append(forecasts)</span>
<span id="cb8-108">        </span>
<span id="cb8-109">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> samples</span>
<span id="cb8-110">    </span>
<span id="cb8-111">    </span>
<span id="cb8-112">    </span>
<span id="cb8-113">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _retransform_forecast(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, samples, n_periods):</span>
<span id="cb8-114">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb8-115"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Retransform forecast (re-difference and exponentiate)</span></span>
<span id="cb8-116"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb8-117"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            samples: Forecast samples for retransformation</span></span>
<span id="cb8-118"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_periods: Number of periods to forecast</span></span>
<span id="cb8-119"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb8-120">        </span>
<span id="cb8-121">        full_sample_tree <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb8-122"></span>
<span id="cb8-123">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> samp <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples:</span>
<span id="cb8-124">            draw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(samp)</span>
<span id="cb8-125">            </span>
<span id="cb8-126">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#retransform seasonal differencing</span></span>
<span id="cb8-127">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.seasonal_differences <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">is</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">not</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb8-128">                result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_diffed.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.seasonal_differences:].values)</span>
<span id="cb8-129">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_periods):</span>
<span id="cb8-130">                    result.append(result[t]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>draw[t])</span>
<span id="cb8-131">                result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> result[<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.seasonal_differences:]</span>
<span id="cb8-132">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb8-133">                result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb8-134">                <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_periods):</span>
<span id="cb8-135">                    result.append(draw[t])</span>
<span id="cb8-136">            </span>
<span id="cb8-137">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#retransform first differences</span></span>
<span id="cb8-138">            y_for_add <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_logged.values[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.log_transform <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.y_df.values[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb8-139">            </span>
<span id="cb8-140">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.first_differences:</span>
<span id="cb8-141">                result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_for_add <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> np.cumsum(result)</span>
<span id="cb8-142">            </span>
<span id="cb8-143">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#retransform log transformation</span></span>
<span id="cb8-144">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.log_transform:</span>
<span id="cb8-145">                result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.exp(result)</span>
<span id="cb8-146">            </span>
<span id="cb8-147">            full_sample_tree.append(result.reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb8-148"></span>
<span id="cb8-149">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> np.concatenate(full_sample_tree,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span></code></pre></div>
</div>
<p>As our data is strictly positive, has a trend and yearly seasonality, we apply the following transformations:</p>
<ul>
<li><strong>Logarithm transformation</strong>: The forecasts are later re-transformed via the exponential function, so the final results are guaranteed to be strictly positive as well.</li>
<li><strong>First differences</strong>: As mentioned above, this removes the linear trend in the data.</li>
<li><strong>Seasonal differences</strong>: <a href="https://faculty.fuqua.duke.edu/~rnau/Decision411_2007/Class10notes.htm?ref=sarem-seitz.com">Seasonal differencing</a> works like first differencing, but with a higher lag order. It removes both deterministic and stochastic seasonality.</li>
</ul>
<p>The main challenge with these transformations is applying their inverses to our predictions correctly, and in reverse order. Luckily, the model above has these steps implemented already.</p>
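<p>To make the inverse transformations concrete, here is a minimal, self-contained sketch of the round trip on a synthetic series (the series, column-free setup and lag order are illustrative, not taken from the dataset above): we apply log, first differences and lag-12 seasonal differences, then invert them in reverse order and recover the original values exactly.</p>

```python
import numpy as np
import pandas as pd

# Synthetic strictly positive series (illustrative only)
rng = np.random.default_rng(0)
y = pd.Series(np.exp(np.linspace(0.0, 2.0, 40) + 0.1 * rng.standard_normal(40)))

# Forward transformations: log -> first differences -> seasonal differences
logged = np.log(y)
diffed = logged.diff().dropna()
seasonal = diffed.diff(12).dropna()

# Inverse transformations, applied in reverse order over the observed span:
# 1) undo seasonal differencing by adding back the lag-12 differenced values
inv_seasonal = seasonal + diffed.shift(12).loc[seasonal.index]
# 2) undo first differencing by adding back the previous log-level
inv_diff = inv_seasonal + logged.shift(1).loc[seasonal.index]
# 3) undo the log transform
recon = np.exp(inv_diff)

# The round trip is exact on the observed data
assert np.allclose(recon.values, y.loc[seasonal.index].values)
```

<p>For out-of-sample forecasts the lagged values required by these inverses are no longer observed, which is why <code>_retransform_forecast</code> above builds them up recursively from the last known observations.</p>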
<section id="evaluating-the-random-forest-forecast" class="level3">
<h3 class="anchored" data-anchor-id="evaluating-the-random-forest-forecast">Evaluating the Random Forest forecast</h3>
<p>Using the data and the model, we get the following result for our test period:</p>
<div id="cell-16" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> RandomForestARModel(n_lags <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, log_transform <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, first_differences <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>, seasonal_differences <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>)</span>
<span id="cb9-2">model.fit(df_train)</span>
<span id="cb9-3"></span>
<span id="cb9-4">predictions_forest <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.sample_forecast(n_periods<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(df_test), n_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)</span>
<span id="cb9-5"></span>
<span id="cb9-6"></span>
<span id="cb9-7">means_forest <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(predictions_forest,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb9-8">lowers_forest <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(predictions_forest,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb9-9">uppers_forest <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(predictions_forest,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb9-10"></span>
<span id="cb9-11">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>))</span>
<span id="cb9-12"></span>
<span id="cb9-13">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb9-14"></span>
<span id="cb9-15">plt.plot(df.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">120</span>:], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training observations (truncated)"</span>)</span>
<span id="cb9-16">plt.plot(df_test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Out-of-sample observations"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb9-17"></span>
<span id="cb9-18">plt.plot(df_test.index,means_forest,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"RF mean forecast"</span>)</span>
<span id="cb9-19"></span>
<span id="cb9-20">plt.fill_between(df_test.index, lowers_forest, uppers_forest, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"RF 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb9-21"></span>
<span id="cb9-22">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb9-23">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-8-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This looks quite good. To verify that we were not just lucky, we use a <a href="https://www.sarem-seitz.com/facebook-prophet-covid-and-why-i-dont-trust-the-prophet/#:~:text=An%20even%20simpler-,forecast,-model">simple benchmark</a> for comparison:</p>
<div id="cell-18" class="cell" data-execution_count="19">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> gaussian_kde</span>
<span id="cb10-2"></span>
<span id="cb10-3">df_train_diffed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(df_train[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"sales"</span>]).diff().dropna()</span>
<span id="cb10-4">df_train_trans <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df_train_diffed.diff(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>).dropna()</span>
<span id="cb10-5"></span>
<span id="cb10-6">kde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gaussian_kde(df_train_trans.values)</span>
<span id="cb10-7"></span>
<span id="cb10-8">target_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(df_train_trans.values)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(df_train_trans.values)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,num<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb10-9"></span>
<span id="cb10-10">full_sample_toy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [] </span>
<span id="cb10-11">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb10-12"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>):</span>
<span id="cb10-13">    draw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> kde.resample(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(df_test)).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb10-14">    result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(df_train_diffed.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:].values)</span>
<span id="cb10-15"></span>
<span id="cb10-16">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(df_test)):</span>
<span id="cb10-17">        result.append(result[t]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>draw[t])</span>
<span id="cb10-18"></span>
<span id="cb10-19">    full_sample_toy.append(np.exp(np.array((np.log(df_train.values[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(result[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:]))).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)))</span>
<span id="cb10-20"></span>
<span id="cb10-21">    </span>
<span id="cb10-22">predictions_toy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate(full_sample_toy,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb10-23">means_toy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(predictions_toy,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb10-24">lowers_toy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(predictions_toy,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb10-25">uppers_toy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(predictions_toy,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb10-26"></span>
<span id="cb10-27">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>))</span>
<span id="cb10-28"></span>
<span id="cb10-29">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb10-30"></span>
<span id="cb10-31">plt.plot(df.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">120</span>:], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training observations (truncated)"</span>)</span>
<span id="cb10-32">plt.plot(df_test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Out-of-sample observations"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb10-33"></span>
<span id="cb10-34">plt.plot(df_test.index,means_toy,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Benchmark mean forecast"</span>)</span>
<span id="cb10-35"></span>
<span id="cb10-36">plt.fill_between(df_test.index, lowers_toy, uppers_toy, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Benchmark 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
orecast inverval">
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb10-37"></span>
<span id="cb10-38">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb10-39">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-9-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Evidently, the benchmark intervals are much worse than those of the Random Forest. The benchmark mean forecast starts out reasonably but deteriorates quickly after only a few steps.</p>
<p>Let’s compare both mean forecasts in a single chart:</p>
<div id="cell-20" class="cell" data-execution_count="21">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>))</span>
<span id="cb11-2"></span>
<span id="cb11-3">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb11-4"></span>
<span id="cb11-5">plt.plot(df.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">120</span>:], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training observations (truncated)"</span>)</span>
<span id="cb11-6">plt.plot(df_test, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Out-of-sample observations"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>)</span>
<span id="cb11-7"></span>
<span id="cb11-8">plt.plot(df_test.index,means_forest,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"RF mean forecast"</span>,lw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb11-9">plt.plot(df_test.index,means_toy,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Benchmark mean forecast"</span>, lw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb11-10"></span>
<span id="cb11-11">plt.legend(fontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>)</span>
<span id="cb11-12">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests_files/figure-html/cell-10-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<div id="cell-21" class="cell" data-execution_count="22">
<div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb12-1">rmse_forest <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((df_test.values[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> means_forest)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb12-2">rmse_toy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((df_test.values[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> means_toy)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb12-3"></span>
<span id="cb12-4"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Random Forest: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(rmse_forest))</span>
<span id="cb12-5"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Benchmark: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(rmse_toy))</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>Random Forest: 909.7996221364062
Benchmark: 6318.1429838549</code></pre>
</div>
</div>
<p>Clearly, the Random Forest is far superior for longer-horizon forecasts.</p>
</section>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Hopefully, this article gave you some insights into the dos and don’ts of forecasting with tree models. While a single Decision Tree can occasionally be useful, Random Forests usually perform better. The exception is very small datasets, in which case you can still reduce the <code>max_depth</code> of the trees in your forest.</p>
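<p>As a small, hypothetical sketch (the variable names and the depth value are my own, not from the article), restricting tree depth is a one-argument change in scikit-learn:</p>

```python
# Hypothetical sketch: a shallow Random Forest for a small dataset.
# X_train, y_train and max_depth=3 are illustrative assumptions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(123)
X_train = rng.normal(size=(50, 3))  # deliberately small sample
y_train = X_train @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=0.1, size=50)

# Shallow trees act as regularization when observations are scarce
model = RandomForestRegressor(n_estimators=200, max_depth=3, random_state=123)
model.fit(X_train, y_train)
print(model.predict(X_train[:1]))
```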
<p>Obviously, you could easily add external regressors to either model to improve performance further. For example, adding monthly indicator variables might yield more accurate results than the current, purely autoregressive setup.</p>
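<p>A hedged sketch of what that could look like (the DataFrame and column names are hypothetical): month indicators can be one-hot encoded and concatenated to the feature matrix.</p>

```python
# Hypothetical sketch: adding monthly dummy indicators as external regressors.
import numpy as np
import pandas as pd

idx = pd.period_range("2000-01", periods=24, freq="M").to_timestamp()
df = pd.DataFrame({"y": np.arange(24, dtype=float)}, index=idx)

# One-hot month indicators; drop_first avoids perfect collinearity
month_dummies = pd.get_dummies(df.index.month, prefix="month", drop_first=True)
month_dummies.index = df.index

X = pd.concat([df, month_dummies], axis=1)
print(X.shape)
```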
<p>As an alternative to Random Forests, Gradient Boosting could be considered. <a href="https://nixtla.github.io/mlforecast/distributed.models.xgb.html?ref=sarem-seitz.com">Nixtla’s mlforecast</a> package has a very powerful implementation, besides all their other great tools for forecasting. Keep in mind, however, that we cannot transfer the algorithm for forecast intervals to Gradient Boosting.</p>
<p>On another note, keep in mind that forecasting with advanced machine learning is a double-edged sword. While powerful on the surface, ML for time-series can overfit much more quickly than in cross-sectional problems. As long as you properly test your models against some benchmarks, though, they should not be overlooked either.</p>
<p>PS: You can find a full notebook for this article <a href="https://github.com/SaremS/sample_notebooks/blob/master/Probabilistic%20Forecasts%20with%20Random%20Forests.ipynb?ref=sarem-seitz.com">here</a>.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Breiman, Leo. Random forests. Machine Learning, 2001, 45(1), pp.&nbsp;5–32.</p>
<p><strong>[2]</strong> Breiman, Leo, et al.&nbsp;Classification and regression trees. Routledge, 2017.</p>
<p><strong>[3]</strong> Hamilton, James Douglas. Time series analysis. Princeton university press, 2020.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <category>Decision Trees</category>
  <guid>https://www.sarem-seitz.com/posts/forecasting-with-decision-trees-and-random-forests.html</guid>
  <pubDate>Mon, 19 Sep 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Multivariate GARCH with Python and Tensorflow</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/multivariate-garch-with-python-and-tensorflow.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>In an <a href="http://sarem-seitz.com/lets-make-garch-more-flexible-with-normalizing-flows/?ref=sarem-seitz.com">earlier article</a>, we discussed how to replace the conditional Gaussian assumption in a traditional GARCH model. While such gimmicks are a good start, they are far from being useful for actual applications.</p>
<p>One primary limitation is the obvious restriction to a one-dimensional time-series. In reality, however, we typically deal with multiple time-series. Thus, a multivariate GARCH model would be much more appropriate.</p>
<p>Technically, we could fit a separate GARCH model for each series and handle interdependencies afterwards. As long as correlations between the time-series can be presumed constant, this can be a valid and straightforward solution. Once correlation becomes dynamic, however, we could lose important information that way.</p>
<p>As a motivating example, consider stock market returns of correlated assets. It is a commonly observed phenomenon that <a href="https://www.twosigma.com/articles/asset-class-correlations-return-to-normalcy/?ref=sarem-seitz.com">asset returns’ correlation tends to increase heavily during times of crisis</a>. Given such convincing evidence, ignoring these dynamics would be rather unreasonable.</p>
<p>Multivariate GARCH models, namely models for dynamic conditional correlation (DCC), are what we need in this case. The DCC model dates back to the early 2000s, starting with a seminal paper by <a href="https://www.jstor.org/stable/pdf/1392121.pdf?refreqid=excelsior%3A4cab7142cd1ac1427c35cfcfbbf8ab98&amp;ab_segments=&amp;origin=&amp;acceptTC=1&amp;ref=sarem-seitz.com">Robert Engle</a>. For this article, we will work closely with his notation.</p>
</section>
<section id="from-garch-to-multivariate-garch-and-dcc" class="level2">
<h2 class="anchored" data-anchor-id="from-garch-to-multivariate-garch-and-dcc">From GARCH to multivariate GARCH and DCC</h2>
<p>Remember that, for univariate Normal GARCH, we have the following formulas: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Au_%7Bt%20%5Cmid%20t-1%7D%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0%20;%20%5Csigma_%7Bt%20%5Cmid%20t-1%7D%5E2%5Cright)%20%5C%5C%0A%5Csigma_%7Bt%20%5Cmid%20t-1%7D%5E2=%5Calpha_0+%5Calpha%20%5Csigma_%7Bt-1%20%5Cmid%20t-2%7D%5E2+%5Cbeta%20u_%7Bt-1%20%5Cmid%20t-2%7D%5E2%20%5C%5C%0A%5Calpha_0,%20%5Calpha,%20%5Cbeta%20%5Cgeq%200%20%5Cquad%20%5Calpha+%5Cbeta%3C1%0A%5Cend%7Bgathered%7D%0A"> For a deeper look at GARCH and its predecessor ARCH, I recommend reading the original papers (<a href="https://www.jstor.org/stable/1912773?ref=sarem-seitz.com">ARCH</a>, GARCH).</p>
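<p>As a quick illustration (not from the article; the parameter values are arbitrary assumptions), the recursion above can be simulated in a few lines of NumPy:</p>

```python
# Simulate a univariate Normal GARCH(1,1) path; parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
alpha0, alpha, beta = 0.1, 0.85, 0.1   # alpha + beta must stay below 1

T = 1000
sig2 = np.empty(T)
u = np.empty(T)
sig2[0] = alpha0 / (1 - alpha - beta)  # unconditional variance as start value
u[0] = rng.normal(scale=np.sqrt(sig2[0]))

for t in range(1, T):
    # conditional variance recursion from the formula above
    sig2[t] = alpha0 + alpha * sig2[t - 1] + beta * u[t - 1] ** 2
    u[t] = rng.normal(scale=np.sqrt(sig2[t]))

print(round(u.var(), 3))
```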
<p>Over the years, numerous extensions have been proposed to address the shortcomings of this base model, for example:</p>
<ul>
<li><a href="http://www.stat.tugraz.at/AJS/ausg123/123Tayefi.pdf?ref=sarem-seitz.com">FIGARCH</a> to model long memory of shocks in the conditional variance equation</li>
<li><a href="https://vlab.stern.nyu.edu/docs/volatility/EGARCH?ref=sarem-seitz.com">EGARCH</a> for asymmetric effects of positive and negative shocks in the conditional variance</li>
<li>…and <a href="https://edoc.hu-berlin.de/bitstream/handle/18452/4120/20.pdf?sequence=1&amp;ref=sarem-seitz.com">various approaches</a> to make the conditional variance term non-linear</li>
</ul>
<p>As we will see, all these variations of univariate GARCH can be used in a multivariate GARCH/DCC model.</p>
</section>
<section id="general-introduction-to-multivariate-garch" class="level2">
<h2 class="anchored" data-anchor-id="general-introduction-to-multivariate-garch">General introduction to multivariate GARCH</h2>
<p>First, let us introduce a bi-variate random variable <img src="https://latex.codecogs.com/png.latex?%0AU_%7Bt%20%5Cmid%20t-1%7D=%5Cleft(%5Cbegin%7Barray%7D%7Bl%7D%0Au_%7B1,%20t%20%5Cmid%20t-1%7D%20%5C%5C%0Au_%7B2,%20t%20%5Cmid%20t-1%7D%0A%5Cend%7Barray%7D%5Cright)%0A"> with covariance matrix <img src="https://latex.codecogs.com/png.latex?%0A%5CSigma_%7Bt%20%5Cmid%20t-1%7D=%5Cleft%5B%5Cbegin%7Barray%7D%7Bcc%7D%0A%5Csigma_%7B1,%20t%20%5Cmid%20t-1%7D%5E2%20&amp;%20%5Csigma_%7B12,%20t%20%5Cmid%20t-1%7D%20%5C%5C%0A%5Csigma_%7B12,%20t%20%5Cmid%20t-1%7D%20&amp;%20%5Csigma_%7B2,%20t%20%5Cmid%20t-1%7D%5E2%0A%5Cend%7Barray%7D%5Cright%5D%0A"> In addition, we define <img src="https://latex.codecogs.com/png.latex?%0A%5CPsi_%7Bt%20%5Cmid%20t-1%7D=U_%7Bt%20%5Cmid%20t-1%7D%20U_%7Bt%20%5Cmid%20t-1%7D%5ET=%5Cleft%5B%5Cbegin%7Barray%7D%7Bcc%7D%0Au_%7B1,%20t%20%5Cmid%20t-1%7D%5E2%20&amp;%20u_%7B1,%20t%20%5Cmid%20t-1%7D%20u_%7B2,%20t%20%5Cmid%20t-1%7D%20%5C%5C%0Au_%7B1,%20t%20%5Cmid%20t-1%7D%20u_%7B2,%20t%20%5Cmid%20t-1%7D%20&amp;%20u_%7B2,%20t%20%5Cmid%20t-1%7D%5E2%0A%5Cend%7Barray%7D%5Cright%5D%0A"> It can easily be seen that this matrix generalizes the squared observation term from the univariate GARCH model.</p>
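<p>As a quick numerical check (the numbers are my own), the outer product of a bivariate shock vector produces exactly this matrix:</p>

```python
# Psi = U U^T generalizes the squared observation from univariate GARCH.
import numpy as np

U = np.array([0.7, -1.1])   # illustrative bivariate shock at time t
Psi = np.outer(U, U)
print(Psi)
```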
<p>We could now generalize this to higher-dimensional random variables and higher-lag dependencies. For convenience, however, let us stick with the above.</p>
<p>Our goal, then, is to find an explicit formula for the dependency of the covariance matrix on the past. For this, we follow the tradition of GARCH models, i.e.&nbsp;we condition the covariance linearly on past covariances <strong>and</strong> past realizations of the actual random variables.</p>
<p>Notice that the obvious linear transformation <img src="https://latex.codecogs.com/png.latex?%0A%5CSigma_%7Bt%20%5Cmid%20t-1%7D=A_0+A%5ET%20%5CSigma_%7Bt-1%20%5Cmid%20t-2%7D%20A+B%5ET%20%5CPsi_%7Bt-1%20%5Cmid%20t-2%7D%20B,%0A"> (with <img src="https://latex.codecogs.com/png.latex?A_0,%20A,%20B%20%5Cin%20%5Cmathbb%7BR%7D%5E%7B2%20%5Ctimes%202%7D,%20A_0"> positive semi-definite) would be reasonable but highly inefficient for higher dimensions. After all, for a lag-5 model, we would already have 375 free variables. In relation to daily time-series, this is more than a year’s worth of data.</p>
<p>As a first restriction, it makes sense to avoid redundancies due to the symmetry of the covariance matrix. We introduce the following operation: <img src="https://latex.codecogs.com/png.latex?%0A%5Coperatorname%7Bvech%7D%5Cleft(%5CSigma_%7Bt%20%5Cmid%20t-1%7D%5Cright)=%5Cleft(%5Cbegin%7Barray%7D%7Bc%7D%0A%5Csigma_%7B1,%20t%20%5Cmid%20t-1%7D%5E2%20%5C%5C%0A%5Csigma_%7B12,%20t%20%5Cmid%20t-1%7D%20%5C%5C%0A%5Csigma_%7B2,%20t%20%5Cmid%20t-1%7D%5E2%0A%5Cend%7Barray%7D%5Cright)%0A"> Put simply, we stack all elements of the matrix into a vector while removing duplicates. This allows the following simplification of our initial multivariate GARCH model: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Coperatorname%7Bvech%7D%5Cleft(%5CSigma_%7Bt%20%5Cmid%20t-1%7D%5Cright)=%5Cmathbf%7Ba%7D_%7B%5Cmathbf%7B0%7D%7D+A%20%5Coperatorname%7Bvech%7D%5Cleft(%5CSigma_%7Bt-1%20%5Cmid%20t-2%7D%5Cright)+B%20%5Coperatorname%7Bvech%7D%5Cleft(%5CPsi_%7Bt-1%20%5Cmid%20t-2%7D%5Cright)%20%5C%5C%0A%5Cmathbf%7Ba%7D_%7B%5Cmathbf%7B0%7D%7D%20%5Cin%20%5Cmathbb%7BR%7D%5E3%20;%20A,%20B%20%5Cin%20%5Cmathbb%7BR%7D%5E%7B3%20%5Ctimes%203%7D%0A%5Cend%7Bgathered%7D%0A"> For the lag-5 model, this specification reduces the number of free variables to <strong>45</strong>. As this is still quite high, we could impose some restrictions on our model matrices, for example <img src="https://latex.codecogs.com/png.latex?%0AA=%5Cleft%5B%5Cbegin%7Barray%7D%7Bccc%7D%0Aa_1%20&amp;%200%20&amp;%200%20%5C%5C%0A0%20&amp;%20a_2%20&amp;%200%20%5C%5C%0A0%20&amp;%200%20&amp;%20a_3%0A%5Cend%7Barray%7D%5Cright%5D%20%5Cquad%20B=%5Cleft%5B%5Cbegin%7Barray%7D%7Bccc%7D%0Ab_1%20&amp;%200%20&amp;%200%20%5C%5C%0A0%20&amp;%20b_2%20&amp;%200%20%5C%5C%0A0%20&amp;%200%20&amp;%20b_3%0A%5Cend%7Barray%7D%5Cright%5D%0A"> i.e.&nbsp;making the matrices diagonal. Going back, again, to the lag-5 model, we would now be down to <strong>15</strong> free variables.</p>
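<p>A minimal NumPy sketch of the vech operation (the helper name is my own):</p>

```python
# vech: stack the lower-triangular part (incl. diagonal) of a symmetric
# matrix into a vector, dropping the duplicated upper-triangular entries.
import numpy as np

def vech(S):
    # lower-triangular indices yield the order sigma_1^2, sigma_12, sigma_2^2
    rows, cols = np.tril_indices(S.shape[0])
    return S[rows, cols]

Sigma = np.array([[4.0, 1.5],
                  [1.5, 9.0]])
print(vech(Sigma))
```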
<section id="multivariate-garch-with-constant-and-dynamic-correlation" class="level3">
<h3 class="anchored" data-anchor-id="multivariate-garch-with-constant-and-dynamic-correlation">Multivariate GARCH with constant and dynamic correlation</h3>
<p>Another class of multivariate GARCH specifications has been proposed by <a href="http://public.econ.duke.edu/~boller/Published_Papers/restat_90.pdf?ref=sarem-seitz.com">Bollerslev</a> and <a href="https://www.jstor.org/stable/pdf/1392121.pdf?refreqid=excelsior%3A4cab7142cd1ac1427c35cfcfbbf8ab98&amp;ab_segments=&amp;origin=&amp;acceptTC=1&amp;ref=sarem-seitz.com">Engle</a>. The core idea is to split conditional covariance into conditional standard deviations and conditional correlations: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A%5CSigma_%7Bt%20%5Cmid%20t-1%7D%20&amp;%20=D_%7Bt%20%5Cmid%20t-1%7D%20R_%7Bt%20%5Cmid%20t-1%7D%20D_%7Bt%20%5Cmid%20t-1%7D%20%5C%5C%0AD_%7Bt%20%5Cmid%20t-1%7D%20&amp;%20=%5Cleft%5B%5Cbegin%7Barray%7D%7Bcc%7D%0A%5Csigma_%7B1,%20t%20%5Cmid%20t-1%7D%20&amp;%200%20%5C%5C%0A0%20&amp;%20%5Csigma_%7B2,%20t%20%5Cmid%20t-1%7D%0A%5Cend%7Barray%7D%5Cright%5D%20%5C%5C%0AR_%7Bt%20%5Cmid%20t-1%7D%20&amp;%20=%5Cleft%5B%5Cbegin%7Barray%7D%7Bcc%7D%0A1%20&amp;%20%5Crho_%7B12,%20t%20%5Cmid%20t-1%7D%20%5C%5C%0A%5Crho_%7B12,%20t%20%5Cmid%20t-1%7D%20&amp;%201%0A%5Cend%7Barray%7D%5Cright%5D%0A%5Cend%7Baligned%7D%0A"> Now, the conditional standard deviations can be modelled as the square roots of independent GARCH models. This leaves room for choosing any GARCH model that is deemed appropriate.</p>
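<p>To make the decomposition concrete, here is a tiny NumPy sketch (with illustrative numbers) that rebuilds the conditional covariance from standard deviations and correlations:</p>

```python
# Sigma = D R D: D holds conditional standard deviations on its diagonal,
# R is the conditional correlation matrix. Numbers are illustrative.
import numpy as np

sigmas = np.array([2.0, 3.0])   # conditional standard deviations
rho = 0.5                       # conditional correlation

D = np.diag(sigmas)
R = np.array([[1.0, rho],
              [rho, 1.0]])

Sigma = D @ R @ D
print(Sigma)
```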
<p>The correlation component can be presumed constant (= <strong>C</strong>onstant <strong>c</strong>onditional <strong>c</strong>orrelation, CCC) or auto-regressive (= <strong>D</strong>ynamic <strong>c</strong>onditional <strong>c</strong>orrelation, DCC). For the latter, we can do the following: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AR_%7Bt%20%5Cmid%20t-1%7D=%5Cleft(%5Csqrt%7B%5Coperatorname%7Bdiag%7D%5Cleft(Q_%7Bt%20%5Cmid%20t-1%7D%5Cright)%7D%20%5Ccdot%20I%5Cright)%5E%7B-1%7D%20Q_%7Bt%20%5Cmid%20t-1%7D%5Cleft(%5Csqrt%7B%5Coperatorname%7Bdiag%7D%5Cleft(Q_%7Bt%20%5Cmid%20t-1%7D%5Cright)%7D%20%5Ccdot%20I%5Cright)%5E%7B-1%7D%20%5C%5C%0AQ_%7Bt%20%5Cmid%20t-1%7D=A_0+%5Cleft(A%20%5Ccirc%20Q_%7Bt-1%20%5Cmid%20t-2%7D-A_0%5Cright)+%5Cleft(B%20%5Ccirc%20%5Ctilde%7B%5CPsi%7D_%7Bt-1%20%5Cmid%20t-2%7D-A_0%5Cright)%0A%5Cend%7Bgathered%7D%0A"> where <img src="https://latex.codecogs.com/png.latex?%0AA_0,%20A,%20B%20%5Cin%20%5Cmathbb%7BR%7D%5E%7B2%20%5Ctimes%202%7D%0A"> <img src="https://latex.codecogs.com/png.latex?A_0,%20%5Cmathbf%7B1%201%7D%5ET-A-B%20%5Csucccurlyeq%200"> (positive semi-definite) and <img 
src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Ctilde%7B%5CPsi%7D_%7Bt-1%20%5Cmid%20t-2%7D=%5Ctilde%7BU%7D_%7Bt-1%20%5Cmid%20t-2%7D%20%5Ctilde%7BU%7D_%7Bt-1%20%5Cmid%20t-2%7D%5ET%20%5C%5C%0A=%5Cleft%5B%5Cbegin%7Barray%7D%7Bcc%7D%0A%5Cfrac%7Bu_%7B1,%20t-1%20%5Cmid%20t-2%7D%5E2%7D%7B%5Csigma_%7B1,%20t-1%20%5Cmid%20t-2%7D%5E2%7D%20&amp;%20%5Cfrac%7Bu_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%7B%5Csigma_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%20%5Cfrac%7Bu_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%7B%5Csigma_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%20%5C%5C%0A%5Cfrac%7Bu_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%7B%5Csigma_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%20%5Cfrac%7Bu_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%7B%5Csigma_%7B1,%20t-1%20%5Cmid%20t-2%7D%7D%20&amp;%20%5Cfrac%7Bu_%7B2,%20t-1%20%5Cmid%20t-2%7D%5E2%7D%7B%5Csigma_%7B2,%20t-1%20%5Cmid%20t-2%7D%5E2%7D%0A%5Cend%7Barray%7D%5Cright%5D%0A%5Cend%7Bgathered%7D%0A"> As you can see, the un-normalized conditional correlation now follows an error-correction like term. Finally, to reduce the amount of free parameters, we can replace the matrices by scalars to get <img src="https://latex.codecogs.com/png.latex?%0AQ_%7Bt%20%5Cmid%20t-1%7D=A_0+%5Cleft(a%20%5Ccdot%20Q_%7Bt-1%20%5Cmid%20t-2%7D-A_0%5Cright)+%5Cleft(b%20%5Ccdot%20%5Ctilde%7B%5CPsi%7D_%7Bt-1%20%5Cmid%20t-2%7D-A_0%5Cright)%0A"> where <img src="https://latex.codecogs.com/png.latex?%0Aa,%20b%20%5Cgeq%200%20%5Cquad%20a+b%3C1%20.%0A"> On the one hand this formulation is less expressive than before. On the other hand, ensuring stationarity is much easier from a programmatic point of view.</p>
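<p>One step of the scalar recursion can be sketched in NumPy. Note that the snippet below uses Engle’s common parameterization, in which the update is a weighted combination of the long-run matrix, the lagged standardized outer product, and the lagged Q; all numbers are illustrative assumptions, not values from the article.</p>

```python
# One scalar-DCC update step, then normalization of Q into a correlation
# matrix R. Parameterization: Q_t = (1 - a - b) * Qbar + a * Psi + b * Q_prev,
# with a, b nonnegative and a + b below 1. Numbers are illustrative.
import numpy as np

a, b = 0.05, 0.90
Qbar = np.array([[1.0, 0.3],
                 [0.3, 1.0]])       # long-run (unconditional) component
Q_prev = np.array([[1.0, 0.4],
                   [0.4, 1.0]])
u_std = np.array([0.8, -1.2])       # GARCH-standardized residuals u_t / sigma_t
Psi = np.outer(u_std, u_std)

Q = (1 - a - b) * Qbar + a * Psi + b * Q_prev

# R = (sqrt(diag(Q)) I)^{-1} Q (sqrt(diag(Q)) I)^{-1}
d = 1.0 / np.sqrt(np.diag(Q))
R = Q * np.outer(d, d)
print(np.round(R, 3))
```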
</section>
</section>
<section id="using-python-and-tensorflow-to-implement-dcc" class="level2">
<h2 class="anchored" data-anchor-id="using-python-and-tensorflow-to-implement-dcc">Using Python and Tensorflow to implement DCC</h2>
<p>Let us start with the full implementation and then look at the details:</p>
<div id="cell-6" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> tensorflow <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> tf</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> tensorflow_probability <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> tfp</span>
<span id="cb1-3"></span>
<span id="cb1-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> MGARCH_DCC(tf.keras.Model):</span>
<span id="cb1-5">    <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-6"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Tensorflow/Keras implementation of multivariate GARCH under dynamic conditional correlation (DCC) specification.</span></span>
<span id="cb1-7"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    Further reading:</span></span>
<span id="cb1-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        - Engle, Robert. "Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models."</span></span>
<span id="cb1-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        - Bollerslev, Tim. "Modeling the Coherence in Short-Run Nominal Exchange Rates: A Multi-variate Generalized ARCH Model."</span></span>
<span id="cb1-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        - Lütkepohl, Helmut. "New introduction to multiple time series analysis."</span></span>
<span id="cb1-11"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">    """</span></span>
<span id="cb1-12">    </span>
<span id="cb1-13">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y):</span>
<span id="cb1-14">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-15"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            y: NxM numpy.array of N observations of M correlated time-series</span></span>
<span id="cb1-17"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-18">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb1-19">        n_dims <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y.shape[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-20">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> n_dims</span>
<span id="cb1-21">        </span>
<span id="cb1-22">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(np.mean(y,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)) <span class="co" style="color: #5E5E5E;
background-color: null;
#use a mean variable</span></span>">
font-style: inherit;">#learnable mean vector, initialized at the sample mean</span></span>
<span id="cb1-23">        </span>
<span id="cb1-24">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(np.std(y,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#initial standard deviations at t=0</span></span>
<span id="cb1-25">        </span>
<span id="cb1-26">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#we initialize all restricted parameters to lie inside the desired range</span></span>
<span id="cb1-27">        <span class="co" style="color: #5E5E5E;
background-color: null;
#by keeping the learning rate low, this should result in admissible results</span></span>">
font-style: inherit;">#by keeping the learning rate low, the estimates should stay admissible</span></span>
<span id="cb1-28">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#for more complex models, this might not suffice</span></span>
<span id="cb1-29">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.alpha0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(np.std(y,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>))</span>
<span id="cb1-30">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.zeros(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(n_dims,))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span>)</span>
<span id="cb1-31">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.zeros(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(n_dims,))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span>)</span>
<span id="cb1-32">        </span>
<span id="cb1-33">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.L0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(np.float32(np.linalg.cholesky(np.corrcoef(y.T)))) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#decomposition of A_0</span></span>
<span id="cb1-34">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.A <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.zeros(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.9</span>)</span>
<span id="cb1-35">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.B <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.zeros(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)</span>
<span id="cb1-36">        </span>
<span id="cb1-37">           </span>
<span id="cb1-38">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> call(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y):</span>
<span id="cb1-39">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-40"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-41"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            y: NxM numpy.array of N observations of M correlated time-series</span></span>
<span id="cb1-42"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-43">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.get_conditional_dists(y)</span>
<span id="cb1-44">    </span>
<span id="cb1-45">    </span>
<span id="cb1-46">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_log_probs(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y):</span>
<span id="cb1-47">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-48"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Calculate log probabilities for a given matrix of time-series observations</span></span>
<span id="cb1-49"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-50"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            y: NxM numpy.array of N observations of M correlated time-series</span></span>
<span id="cb1-51"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-52">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.get_conditional_dists(y).log_prob(y)</span>
<span id="cb1-53">    </span>
<span id="cb1-54">        </span>
<span id="cb1-55">    <span class="at" style="color: #657422;
background-color: null;
font-style: inherit;">@tf.function</span></span>
<span id="cb1-56">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_conditional_dists(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y):</span>
<span id="cb1-57">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-58"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Calculate conditional distributions for given observations</span></span>
<span id="cb1-59"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-60"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            y: NxM numpy.array of N observations of M correlated time-series</span></span>
<span id="cb1-61"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-62">        T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.shape(y)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb1-63">        </span>
<span id="cb1-64">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#create containers for looping</span></span>
<span id="cb1-65">        mus <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.TensorArray(tf.float32, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> T) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#observation mean container</span></span>
<span id="cb1-66">        Sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.TensorArray(tf.float32, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> T) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#observation covariance container</span></span>
<span id="cb1-67"></span>
<span id="cb1-68">        </span>
<span id="cb1-69">        sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.TensorArray(tf.float32, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-70">        us <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.TensorArray(tf.float32, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-71">        Qs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.TensorArray(tf.float32, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-72">        </span>
<span id="cb1-73">        </span>
<span id="cb1-74">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#initialize respective values for t=0</span></span>
<span id="cb1-75">        sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas.write(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma0)</span>
<span id="cb1-76">        A0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.transpose(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.L0)<span class="op" style="color: #5E5E5E;
background-color: null;
@</span>self.L0</span>">
font-style: inherit;">@</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.L0</span>
<span id="cb1-77">        Qs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Qs.write(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, A0) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#set initial unnormalized correlation equal to mean matrix</span></span>
<span id="cb1-78">        us <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> us.write(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, tf.zeros(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims,))) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#initial observations equal to zero</span></span>
<span id="cb1-79">        </span>
<span id="cb1-80">        </span>
<span id="cb1-81">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#convenience</span></span>
<span id="cb1-82">        sigma0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma0</span>
<span id="cb1-83">        alpha0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.alpha0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#ensure positivity</span></span>
<span id="cb1-84">        alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.alpha</span>
<span id="cb1-85">        beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.beta</span>
<span id="cb1-86"></span>
<span id="cb1-87">        A <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.A</span>
<span id="cb1-88">        B <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.B</span>
<span id="cb1-89">        </span>
<span id="cb1-90">        </span>
<span id="cb1-91">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> tf.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(T):</span>
<span id="cb1-92">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#tm1 = 't minus 1'</span></span>
<span id="cb1-93">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#suppress conditioning on past in notation</span></span>
<span id="cb1-94">            </span>
<span id="cb1-95">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#1) calculate conditional standard deviations</span></span>
<span id="cb1-96">            u_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> us.read(t) </span>
<span id="cb1-97">            sigma_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas.read(t)</span>
<span id="cb1-98">            </span>
<span id="cb1-99">            sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (alpha0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>u_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb1-100">            </span>
<span id="cb1-101">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#2) calculate conditional correlations</span></span>
<span id="cb1-102">            u_tm1_standardized <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> u_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>sigma_tm1</span>
<span id="cb1-103">                   </span>
<span id="cb1-104">            Psi_tilde_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.reshape(u_tm1_standardized, (<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>tf.reshape(u_tm1_standardized, (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims))</span>
<span id="cb1-105"></span>
<span id="cb1-106">            Q_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Qs.read(t)</span>
<span id="cb1-107">            Q_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> A0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> A<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(Q_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> A0) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> B<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(Psi_tilde_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> A0)</span>
<span id="cb1-108">            R_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.cov_to_corr(Q_t)</span>
<span id="cb1-109">            </span>
<span id="cb1-110">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#3) calculate conditional covariance</span></span>
<span id="cb1-111">            D_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.linalg.LinearOperatorDiag(sigma_t)</span>
<span id="cb1-112">            Sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> D_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>R_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>D_t</span>
<span id="cb1-113">              </span>
<span id="cb1-114">            </span>
<span id="cb1-115">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#4) store values for next iteration</span></span>
<span id="cb1-116">            sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas.write(t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, sigma_t)</span>
<span id="cb1-117">            us <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> us.write(t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, y[t,:]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#we want to model the zero-mean disturbances</span></span>
<span id="cb1-118">            Qs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Qs.write(t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, Q_t)</span>
<span id="cb1-119">            </span>
<span id="cb1-120">            mus <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mus.write(t, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU)</span>
<span id="cb1-121">            Sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Sigmas.write(t, Sigma_t)</span>
<span id="cb1-122">            </span>
<span id="cb1-123">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> tfp.distributions.MultivariateNormalFullCovariance(mus.stack(), Sigmas.stack())</span>
<span id="cb1-124">    </span>
<span id="cb1-125">    </span>
<span id="cb1-126">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> cov_to_corr(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, S):</span>
<span id="cb1-127">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-128"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Transforms covariance matrix to a correlation matrix via matrix operations</span></span>
<span id="cb1-129"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-130"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            S: Symmetric, positive semidefinite covariance matrix (tf.Tensor)</span></span>
<span id="cb1-131"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-132">        D <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.linalg.LinearOperatorDiag(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>(tf.linalg.diag_part(S)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>))</span>
<span id="cb1-133">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> D<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>S<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>D</span>
<span id="cb1-134">        </span>
<span id="cb1-135">    </span>
<span id="cb1-136"></span>
<span id="cb1-137">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> train_step(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, data):</span>
<span id="cb1-138">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-139"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Custom training step to handle keras model.fit given that there is no input-output structure in our model</span></span>
<span id="cb1-140"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-141"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            S: Symmetric, positive semidefinite covariance matrix (tf.Tensor)</span></span>
<span id="cb1-142"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-143">        x,y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data</span>
<span id="cb1-144">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">with</span> tf.GradientTape() <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> tape:</span>
<span id="cb1-145">            loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>tf.math.reduce_mean(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.get_log_probs(y))</span>
<span id="cb1-146">            </span>
<span id="cb1-147">        trainable_vars <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.trainable_weights</span>
<span id="cb1-148">        gradients <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tape.gradient(loss, trainable_vars)</span>
<span id="cb1-149">        </span>
<span id="cb1-150">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.optimizer.apply_gradients(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">zip</span>(gradients, trainable_vars))</span>
<span id="cb1-151">        </span>
<span id="cb1-152">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Current loss"</span>: loss}</span>
<span id="cb1-153">    </span>
<span id="cb1-154">    </span>
<span id="cb1-155">    </span>
<span id="cb1-156">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> sample_forecast(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y, T_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, n_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>):</span>
<span id="cb1-157">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">"""</span></span>
<span id="cb1-158"><span class="co" style="color: #5E5E5E;
background-color: null;
        Create forecast samples to use for monte-carlo simulation">
font-style: inherit;">        Create forecast samples for Monte Carlo estimation of forecast quantities of interest (e.g. mean, var, corr, etc.)</span></span>
<span id="cb1-159"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        </span><span class="al" style="color: #AD0000;
background-color: null;
font-style: inherit;">WARNING</span><span class="co" style="color: #5E5E5E;
background-color: null;
: This is not optimized very much and can take some time to run,">
font-style: inherit;">: This is not heavily optimized and can take a while to run, likely due to Python's slow loops; there is room for improvement</span></span>
<span id="cb1-160"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        Args:</span></span>
<span id="cb1-161"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            y: numpy.array of training data, used to initialize the forecast values</span></span>
<span id="cb1-162"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            T_forecast: number of periods to predict (integer)</span></span>
<span id="cb1-163"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">            n_samples: Number of samples to draw (integer)</span></span>
<span id="cb1-164"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">        """</span></span>
<span id="cb1-165">        T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.shape(y)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb1-166">        </span>
<span id="cb1-167">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#create lists for looping; no gradients, thus no tf.TensorArrays needed</span></span>
<span id="cb1-168">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#can initialize directly</span></span>
<span id="cb1-169">        mus <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-170">        Sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-171"></span>
<span id="cb1-172">        us <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [tf.zeros(shape<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims,))]</span>
<span id="cb1-173">        sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma0]        </span>
<span id="cb1-174">        Qs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-175">        </span>
<span id="cb1-176">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#initialize remaining values for t=0</span></span>
<span id="cb1-177">        A0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.transpose(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.L0)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>self.L0</span>
<span id="cb1-178">        Qs.append(A0)</span>
<span id="cb1-179">        </span>
<span id="cb1-180">        </span>
<span id="cb1-181">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#convenience</span></span>
<span id="cb1-182">        sigma0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.sigma0 </span>
<span id="cb1-183">        alpha0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.alpha0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#ensure positivity</span></span>
<span id="cb1-184">        alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.alpha</span>
<span id="cb1-185">        beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.beta</span>
<span id="cb1-186"></span>
<span id="cb1-187">        A <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.A</span>
<span id="cb1-188">        B <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.B</span>
<span id="cb1-189">        </span>
<span id="cb1-190">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#'warmup' to initialize latest lagged features</span></span>
<span id="cb1-191">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(T):</span>
<span id="cb1-192">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#tm1 = 't minus 1'</span></span>
<span id="cb1-193">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#suppress conditioning on past in notation</span></span>
<span id="cb1-194">            u_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> us[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-195">            sigma_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-196">            </span>
<span id="cb1-197">            sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (alpha0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>u_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb1-198">            </span>
<span id="cb1-199">            u_tm1_standardized <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> u_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>sigma_tm1</span>
<span id="cb1-200">            </span>
<span id="cb1-201">            Psi_tilde_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.reshape(u_tm1_standardized, (<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>tf.reshape(u_tm1_standardized, (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims))</span>
<span id="cb1-202"></span>
<span id="cb1-203">            Q_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Qs[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-204">            Q_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> A0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> A<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(Q_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> A0) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> B<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(Psi_tilde_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> A0)</span>
<span id="cb1-205">            R_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.cov_to_corr(Q_t)</span>
<span id="cb1-206">            </span>
<span id="cb1-207">            D_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.linalg.LinearOperatorDiag(sigma_t)</span>
<span id="cb1-208">            Sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> D_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>R_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>D_t</span>
<span id="cb1-209">              </span>
<span id="cb1-210">            </span>
<span id="cb1-211">            sigmas.append(sigma_t)</span>
<span id="cb1-212">            us.append(y[t,:]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#we want to model the zero-mean disturbances</span></span>
<span id="cb1-213">            Qs.append(Q_t)</span>
<span id="cb1-214">            </span>
<span id="cb1-215">            mus.append(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU)</span>
<span id="cb1-216">            Sigmas.append(Sigma_t)</span>
<span id="cb1-217">  </span>
<span id="cb1-218">            </span>
<span id="cb1-219">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#sample containers</span></span>
<span id="cb1-220">        y_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-221">        R_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-222">        sigma_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-223">        </span>
<span id="cb1-224">        </span>
<span id="cb1-225">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> n <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(n_samples):</span>
<span id="cb1-226">            </span>
<span id="cb1-227">            mus_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-228">            Sigmas_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-229"></span>
<span id="cb1-230">            sigmas_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [sigmas[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]</span>
<span id="cb1-231">            us_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [us[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]</span>
<span id="cb1-232">            Qs_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [Qs[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]</span>
<span id="cb1-233">            </span>
<span id="cb1-234">            <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#forecast containers</span></span>
<span id="cb1-235">            ys_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-236">            sig_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-237">            R_samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [] </span>
<span id="cb1-238">            </span>
<span id="cb1-239">            </span>
<span id="cb1-240">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(T_forecast):</span>
<span id="cb1-241">                u_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> us_samp[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-242">                sigma_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas_samp[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-243"></span>
<span id="cb1-244">                sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (alpha0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>u_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span></span>
<span id="cb1-245"></span>
<span id="cb1-246">                u_tm1_standardized <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> u_tm1<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>sigma_tm1</span>
<span id="cb1-247">                </span>
<span id="cb1-248">                Psi_tilde_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.reshape(u_tm1_standardized, (<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>tf.reshape(u_tm1_standardized, (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims))</span>
<span id="cb1-249"></span>
<span id="cb1-250">                Q_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Qs_samp[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb1-251">                Q_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> A0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> A<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(Q_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> A0) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> B<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(Psi_tilde_tm1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> A0)</span>
<span id="cb1-252">                R_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.cov_to_corr(Q_t)</span>
<span id="cb1-253"></span>
<span id="cb1-254">                D_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.linalg.LinearOperatorDiag(sigma_t)</span>
<span id="cb1-255">                Sigma_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> D_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>R_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>D_t</span>
<span id="cb1-256"></span>
<span id="cb1-257"></span>
<span id="cb1-258">                sigmas_samp.append(sigma_t)</span>
<span id="cb1-259">                Qs_samp.append(Q_t)</span>
<span id="cb1-260">                </span>
<span id="cb1-261">                ynext <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfp.distributions.MultivariateNormalFullCovariance(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU, Sigma_t).sample()</span>
<span id="cb1-262">                ys_samp.append(tf.reshape(ynext,(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)))</span>
<span id="cb1-263">                sig_samp.append(tf.reshape(sigma_t,(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)))</span>
<span id="cb1-264">                R_samp.append(tf.reshape(R_t,(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.n_dims)))</span>
<span id="cb1-265">                </span>
<span id="cb1-266">                us_samp.append(ynext<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.MU)</span>
<span id="cb1-267">                </span>
<span id="cb1-268">            y_samples.append(tf.concat(ys_samp,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb1-269">            R_samples.append(tf.concat(R_samp,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb1-270">            sigma_samples.append(tf.concat(sig_samp,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb1-271">        </span>
<span id="cb1-272">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> tf.concat(y_samples,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).numpy(), tf.concat(R_samples,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).numpy(), tf.concat(sigma_samples,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).numpy()</span></code></pre></div>
</div>
<p>While the code is quite lengthy, it really serves only two purposes:</p>
<ol type="1">
<li><strong>Calculate the in-sample distribution</strong> (get_conditional_dists(…)) - this is needed for optimization via <a href="https://en.wikipedia.org/wiki/Maximum_likelihood_estimation?ref=sarem-seitz.com">maximum likelihood</a>. The function calculates the likelihood value of each observation under the MGARCH model.</li>
<li><strong>Forecast the out-of-sample distribution</strong> (sample_forecast(…)) - as the formulas for the model as a whole are quite complex, it's difficult to calculate the forecast distributions in closed form. However, we can sample from the target distribution fairly easily. With a sufficiently large sample, we can estimate all relevant quantities of interest (e.g.&nbsp;forecast mean and quantiles).</li>
</ol>
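<p>As a sketch of the second point: assuming <code>sample_forecast(…)</code> returns the forecast draws as an array of shape <code>(n_samples, T_forecast, n_dims)</code>, the Monte Carlo estimates reduce to plain NumPy reductions over the sample axis. The random array below is only a stand-in for actual model output:</p>

```python
import numpy as np

# Stand-in for the output of sample_forecast(...):
# an array of forecast draws with assumed shape (n_samples, T_forecast, n_dims)
rng = np.random.default_rng(0)
y_samples = rng.normal(size=(500, 30, 2))

# Point forecast per horizon and dimension: the Monte Carlo mean
forecast_mean = y_samples.mean(axis=0)                       # shape (30, 2)

# 90% forecast interval via empirical quantiles over the sample axis
lower, upper = np.quantile(y_samples, [0.05, 0.95], axis=0)  # each (30, 2)

# Time-varying cross-correlation estimate between the two dimensions
corr_per_step = np.array([
    np.corrcoef(y_samples[:, t, 0], y_samples[:, t, 1])[0, 1]
    for t in range(y_samples.shape[1])
])                                                           # shape (30,)
```

<p>Any other quantity of interest (e.g.&nbsp;tail probabilities) can be estimated the same way from the raw draws.</p>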
<p>Notice that we have specified the conditional distribution as a multivariate Gaussian here. Given the theory above, this is not a necessity: a multivariate t-distribution, for example, could work equally well or even better. A Gaussian is simply convenient to work with.</p>
<p>Now, the remaining functions are basically just helpers to maintain some structure. I decided not to break the key functions down further, in order to keep the calculations in one place. If we were unit testing our model, it would be sensible to split things up into more easily testable units.</p>
<p>As we want to use the Keras API for training, we need to customize the training procedure (train_step(…)). Contrary to typical Keras use-cases, our training data is not split between input and output data. Rather, we only have one set of data, namely the time-series observations.</p>
<p>Finally, each training step needs to process all training observations at once (<strong>no mini-batching</strong>), and the observations must always remain in order (<strong>no shuffling</strong>).</p>
<p>This yields the following generic training loop:</p>
<p><code>model.fit(ts_data, ts_data, batch_size=len(ts_data), shuffle=False, epochs = 300, verbose=False)</code></p>
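<p>For illustration, here is a minimal, self-contained sketch of such a custom <code>train_step(…)</code>. The stand-in parameters and the <code>negative_log_likelihood</code> helper are assumptions made for the sake of a runnable example, not the verbatim implementation: the real model would compute the MGARCH likelihood via <code>get_conditional_dists(…)</code> from its full parameter set.</p>

```python
import tensorflow as tf


class MGARCHSketch(tf.keras.Model):
    """Illustrative skeleton only - the real model holds the full set of
    MGARCH parameters (MU, L0, alpha, beta, ...) instead of these stand-ins."""

    def __init__(self):
        super().__init__()
        # Stand-in trainable parameters so the sketch is runnable
        self.mu = self.add_weight(name="mu", shape=(2,), initializer="zeros")
        self.log_sigma = self.add_weight(name="log_sigma", shape=(2,), initializer="zeros")

    def negative_log_likelihood(self, y):
        # Hypothetical helper: summed Gaussian NLL over all observations, in order.
        # The real model would derive this from get_conditional_dists(...).
        sigma = tf.exp(self.log_sigma)
        z = (y - self.mu) / sigma
        return tf.reduce_sum(0.5 * z ** 2 + self.log_sigma)

    def train_step(self, data):
        y, _ = data  # Keras passes (x, y); here both are the same time-series tensor
        with tf.GradientTape() as tape:
            loss = self.negative_log_likelihood(y)
        grads = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(grads, self.trainable_variables))
        return {"loss": loss}
```

<p>Because the loss is computed inside <code>train_step(…)</code>, the model is compiled with only an optimizer, e.g.&nbsp;<code>model.compile(optimizer=tf.keras.optimizers.Adam())</code>, and then trained with the <code>fit(…)</code> call above.</p>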
</section>
<section id="multivariate-garch-in-python---an-example" class="level2">
<h2 class="anchored" data-anchor-id="multivariate-garch-in-python---an-example">Multivariate GARCH in Python - an example</h2>
<p>We can now test our model on a simple example and see what happens. Given Python’s seamless interaction with <a href="https://finance.yahoo.com/?ref=sarem-seitz.com">Yahoo Finance</a>, we can pull some data for <strong>DAX</strong> and <strong>S&amp;P 500</strong>:</p>
<div id="cell-9" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> yfinance <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> yf</span>
<span id="cb2-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb2-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb2-4"></span>
<span id="cb2-5">data <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> yf.download(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^GDAXI ^GSPC"</span>, start<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2017-09-10"</span>, end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"2022-09-10"</span>, interval<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"1d"</span>)</span>
<span id="cb2-6"></span>
<span id="cb2-7">close <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Close"</span>]</span>
<span id="cb2-8">returns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(close).diff().dropna()</span>
<span id="cb2-9"></span>
<span id="cb2-10">fig, axs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">22</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb2-11"></span>
<span id="cb2-12"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>):</span>
<span id="cb2-13">    axs[i].plot(returns.iloc[:,i])</span>
<span id="cb2-14">    axs[i].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-15">    axs[i].margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb2-16">    axs[i].set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;"> - log-returns"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(returns.columns[i]),size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>)</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>[*********************100%***********************]  2 of 2 completed</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/multivariate-garch-with-python-and-tensorflow_files/figure-html/cell-3-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The typical volatility clusters are visible for both time-series. To see what happens with correlation between both stocks over time, we can plot the 60-day rolling correlation:</p>
<div id="cell-11" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb4-2"></span>
<span id="cb4-3">rolling_corrs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> returns.rolling(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">60</span>,min_periods<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>).corr()</span>
<span id="cb4-4">gdaxi_sp500_rollcorr <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> rolling_corrs[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^GDAXI"</span>][rolling_corrs.index.get_level_values(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"^GSPC"</span>]</span>
<span id="cb4-5"></span>
<span id="cb4-6">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">22</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>))</span>
<span id="cb4-7"></span>
<span id="cb4-8">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"60 day rolling correlation - DAX vs. S&amp;P500"</span>,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>)</span>
<span id="cb4-9">plt.plot(returns.index[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],gdaxi_sp500_rollcorr.values[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"60 day rolling correlation"</span>)</span>
<span id="cb4-10">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-11">plt.margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/multivariate-garch-with-python-and-tensorflow_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>It appears that the correlation between the two indices has dropped since the beginning of the pandemic. Afterwards, it seems to fluctuate in cycles.</p>
<p>All in all, the pattern looks like a discretized version of an <a href="https://towardsdatascience.com/stochastic-processes-simulation-the-ornstein-uhlenbeck-process-e8bff820f3?ref=sarem-seitz.com">Ornstein-Uhlenbeck process</a>, i.e.&nbsp;mean-reverting with stochastic fluctuations. The error-correction formulation in our model should be able to capture this behaviour.</p>
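<p>To see what such a mean-reverting pattern looks like, here is a minimal simulation of a discretized Ornstein-Uhlenbeck (i.e.&nbsp;AR(1)) process; the parameter values are illustrative assumptions, not estimates from the data above:</p>

```python
import numpy as np

rng = np.random.default_rng(42)
# theta: mean-reversion speed, mu: long-run level, sigma: noise scale (all assumed)
theta, mu, sigma, n = 0.05, 0.5, 0.02, 500

x = np.empty(n)
x[0] = mu
for t in range(1, n):
    # discretized OU recursion: pull towards mu, plus Gaussian noise
    x[t] = x[t - 1] + theta * (mu - x[t - 1]) + sigma * rng.normal()

# the path keeps fluctuating around mu instead of drifting away like a random walk
print(abs(x.mean() - mu) < 0.2)
```

The error-correction term in the DCC recursion plays an analogous role for the conditional correlation, pulling it back towards its unconditional level.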
<p>After splitting the data into train and test sets (the <strong>last 90 observations</strong> form the test set), we can fit the model. Then we draw samples from the <strong>90-days-ahead</strong> forecast distribution as follows (this takes some time):</p>
<div id="cell-13" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb5-2">tf.random.set_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb5-3"></span>
<span id="cb5-4"></span>
<span id="cb5-5">train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.float32(returns)[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>,:]</span>
<span id="cb5-6">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.float32(returns)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>:,:]</span>
<span id="cb5-7"></span>
<span id="cb5-8">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MGARCH_DCC(train)</span>
<span id="cb5-9"></span>
<span id="cb5-10">model.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">compile</span>(optimizer<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>tf.keras.optimizers.Adam(learning_rate<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-2</span>))</span>
<span id="cb5-11">model.fit(train, train, batch_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(train), shuffle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>, epochs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">300</span>, verbose<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span>)</span>
<span id="cb5-12"></span>
<span id="cb5-13">fcast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.sample_forecast(train,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>)</span></code></pre></div>
</div>
<p>Now, we are particularly interested in the conditional correlation fit and forecasts:</p>
<div id="cell-15" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> datetime <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> timedelta</span>
<span id="cb6-2"></span>
<span id="cb6-3">corrs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> fcast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>][:,:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb6-4">corr_means <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(corrs,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-5">corr_lowers <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(corrs,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-6">corr_uppers <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(corrs,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-7"></span>
<span id="cb6-8"></span>
<span id="cb6-9">conditional_dists <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model(np.float32(returns.values))</span>
<span id="cb6-10"></span>
<span id="cb6-11">conditional_correlations <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [model.cov_to_corr(conditional_dists.covariance()[i,:,:])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].numpy() <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(returns))]</span>
<span id="cb6-12"></span>
<span id="cb6-13">idx_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> returns[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>].index</span>
<span id="cb6-14">idx_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.date_range(returns[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>].index[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> timedelta(days<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), returns[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>].index[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> timedelta(days<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>))</span>
<span id="cb6-15"></span>
<span id="cb6-16">fig, axs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>), gridspec_kw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>{<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'height_ratios'</span>: [<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]})</span>
<span id="cb6-17"></span>
<span id="cb6-18">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional Correlation - DAX, S&amp;P500"</span>, size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>)</span>
<span id="cb6-19"></span>
<span id="cb6-20">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].axhline(np.corrcoef(returns.T)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>,ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Unconditional sample correlation"</span>)</span>
<span id="cb6-21"></span>
<span id="cb6-22">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].plot(idx_train[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],conditional_correlations[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MGARCH in-sample conditional correlation"</span>)</span>
<span id="cb6-23">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].plot(idx_test,conditional_correlations[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">90</span>:],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MGARCH out-of-sample conditional correlation"</span>)</span>
<span id="cb6-24"></span>
<span id="cb6-25">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].plot(idx_test, corr_means,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.9</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MGARCH correlation mean forecast"</span>)</span>
<span id="cb6-26">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].fill_between(idx_test, corr_lowers, corr_uppers, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MGARCH correlation 90</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb6-27"></span>
<span id="cb6-28">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb6-29">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].legend(prop <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"size"</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>})</span>
<span id="cb6-30">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb6-31"></span>
<span id="cb6-32"></span>
<span id="cb6-33">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Sanity check: Model predicted VS. rolling correlation"</span>,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>)</span>
<span id="cb6-34">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].plot(returns.index[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],gdaxi_sp500_rollcorr.values[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"60 day rolling correlation"</span>)</span>
<span id="cb6-35">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].plot(returns.index[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],conditional_correlations[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>:],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"MGARCH in-sample conditional correlation"</span>)</span>
<span id="cb6-36">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb6-37">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].legend(prop <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"size"</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>})</span>
<span id="cb6-38">axs[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].margins(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/multivariate-garch-with-python-and-tensorflow_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
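<p>The <code>cov_to_corr</code> helper used above converts a covariance matrix into a correlation matrix. A plain-NumPy sketch of the standard conversion, dividing each covariance by the product of the corresponding standard deviations (the model's actual implementation works on TensorFlow tensors but computes the same quantity):</p>

```python
import numpy as np

def cov_to_corr(cov: np.ndarray) -> np.ndarray:
    # standard conversion: corr_ij = cov_ij / (sigma_i * sigma_j)
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

cov = np.array([[4.0, 1.2],
                [1.2, 1.0]])
# diagonal entries become 1; off-diagonal is 1.2 / (2.0 * 1.0) = 0.6
print(cov_to_corr(cov))
```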
<p>The forecasted correlation (blue) captures the conditional correlation under our model (red) quite well. Obviously, though, the true correlation is unknown. Nevertheless, our model also matches the rolling correlation closely, even out-of-sample. This implies that our approach is - at least - not completely off.</p>
<p>Being able to reliably forecast correlations might be useful for <a href="https://www.investopedia.com/terms/s/statisticalarbitrage.asp?ref=sarem-seitz.com">statistical arbitrage</a> strategies. While such strategies typically rely on price movements, correlations could be an interesting alternative signal.</p>
<p>From here, we could also look at price and volatility forecasts. To keep this article from becoming bloated, I’ll leave that to the interested reader. You can find the relevant notebook <a href="https://github.com/SaremS/sample_notebooks/blob/master/Multivariate%20GARCH.ipynb?ref=sarem-seitz.com">here</a> - feel free to extend it with your own experiments.</p>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Today, we took a look at multivariate extensions to GARCH-type models. While a ‘naive’ extension is quite straightforward, we need to be careful not to overparameterize our model. Luckily, there already exists research on useful specifications that mostly avoid this issue.</p>
<p>For deeper insights, it would likely be interesting to consider non-linear extensions of this approach. The trade-off between overfitting and flexibility will possibly be even more relevant there. If you want to head in that direction, you might want to have a look at <a href="https://scholar.google.com/scholar?hl=de&amp;as_sdt=0%2C5&amp;q=multivariate+garch+nonlinear&amp;btnG=&amp;ref=sarem-seitz.com">some results from Google Scholar</a>.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Bollerslev, Tim. Modelling the coherence in short-run nominal exchange rates: a multivariate generalized ARCH model. The review of economics and statistics, 1990, p.&nbsp;498-505.</p>
<p><strong>[2]</strong> Engle, Robert. Dynamic conditional correlation: A simple class of multivariate generalized autoregressive conditional heteroskedasticity models. Journal of Business &amp; Economic Statistics 20.3, 2002, p.&nbsp;339-350.</p>
<p><strong>[3]</strong> Lütkepohl, Helmut. New introduction to multiple time series analysis. Springer Science &amp; Business Media, 2005.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <category>Tensorflow</category>
  <guid>https://www.sarem-seitz.com/posts/multivariate-garch-with-python-and-tensorflow.html</guid>
  <pubDate>Sun, 11 Sep 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Cointegrated time-series and when differencing might be bad</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/cointegrated-time-series-and-when-difference-transformations-might-be-bad.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>A standard method in the time-series analysis toolkit is the <a href="https://machinelearningmastery.com/remove-trends-seasonality-difference-transform-python/?ref=sarem-seitz.com">difference transformation, or <em>differencing</em></a>. Despite being dead simple, differencing can be quite powerful. <a href="https://www.sarem-seitz.com/facebook-prophet-covid-and-why-i-dont-trust-the-prophet/">In fact, it allows us to outperform sophisticated time-series models with what is almost a bare white noise process</a>.</p>
<p>Due to its simplicity, differencing is quite popular whenever some unit-root test turns out significant. While this is fairly safe in the univariate case, things look different for multivariate time-series.</p>
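<p>In the univariate case, the appeal is easy to see: first differences turn a random walk, which has a unit root, back into white noise. A minimal sketch (an assumed illustration, not part of the example below):</p>

```python
import numpy as np

rng = np.random.default_rng(0)
eps = rng.normal(size=1000)   # stationary white-noise innovations
walk = np.cumsum(eps)         # random walk: y_t = y_{t-1} + eps_t, a unit-root process
diffed = np.diff(walk)        # first differences: y_t - y_{t-1}

# differencing removes the stochastic trend and exactly recovers the innovations
print(np.allclose(diffed, eps[1:]))
```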
<p>Let us demonstrate this with a simple example:</p>
</section>
<section id="a-motivating-time-series-example" class="level2">
<h2 class="anchored" data-anchor-id="a-motivating-time-series-example">A motivating time-series example</h2>
<p>To exemplify the underlying issue, I created an artificial, two-dimensional linear time-series:</p>
<div id="cell-4" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"></span>
<span id="cb1-4">A <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array([[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>],[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>]])</span>
<span id="cb1-5">B <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array([[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.9</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>]])</span>
<span id="cb1-6"></span>
<span id="cb1-7">Atilde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> A<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>B.T</span>
<span id="cb1-8"></span>
<span id="cb1-9"></span>
<span id="cb1-10">sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array([[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>]])</span>
<span id="cb1-11"></span>
<span id="cb1-12">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb1-13"></span>
<span id="cb1-14">ys <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.random.normal(size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma]</span>
<span id="cb1-15"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>):    </span>
<span id="cb1-16">    dy <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Atilde<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>ys[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> np.random.normal(size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigma</span>
<span id="cb1-17">    ys.append(ys[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> dy)</span>
<span id="cb1-18"> </span>
<span id="cb1-19">Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate(ys,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).T[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:,:]</span>
<span id="cb1-20"></span>
<span id="cb1-21">Ytrain <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Y[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>,:]</span>
<span id="cb1-22">Ytest <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>:,:]</span>
<span id="cb1-23"></span>
<span id="cb1-24">forecast_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytrain),<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytrain)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytest))</span>
<span id="cb1-25"></span>
<span id="cb1-26">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb1-27">plt.plot(Ytrain[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 1 - Train set"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-28">plt.plot(Ytrain[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 2 - Train set"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb1-29">plt.plot(forecast_range, Ytest[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 1 - Test set"</span>)</span>
<span id="cb1-30">plt.plot(forecast_range, Ytest[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 2 - Test set"</span>)</span>
<span id="cb1-31"></span>
<span id="cb1-32">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-33">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/cointegrated-time-series-and-when-difference-transformations-might-be-bad_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>There seems to be some connection between both time-series, but that might of course just be a <a href="https://en.wikipedia.org/wiki/Spurious_relationship?ref=sarem-seitz.com">spurious relationship</a> over time. The next step you often see in this setting is to test both time-series for unit roots.</p>
<p>An <a href="https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.adfuller.html?ref=sarem-seitz.com">Augmented Dickey-Fuller test from statsmodels</a> yields p-values of 0.8171 and 0.8512. Since the test&#8217;s null hypothesis is the presence of a unit root, we cannot reject it for either series - which matches the visual impression of unit roots in both time-series. Thus, the difference transformation appears to be the logical next step. Let&#8217;s apply it to the train set in order to forecast the test set further down the line:</p>
<div id="cell-6" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1">Ytrain_diff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Ytrain[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:,:]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>Ytrain[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,:]</span>
<span id="cb2-2">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb2-3">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].plot(Ytrain_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-4">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-5">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time Series 1 - Train set differenced"</span>)</span>
<span id="cb2-6"></span>
<span id="cb2-7">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].plot(Ytrain_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-8">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb2-9">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].set_title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time Series 2 - Train set differenced"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="2">
<pre><code>Text(0.5, 1.0, 'Time Series 2 - Train set differenced')</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/cointegrated-time-series-and-when-difference-transformations-might-be-bad_files/figure-html/cell-3-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Next, we can check forecast performance for two VAR(1) models - one trained on the original time-series and one on the transformed one:</p>
<div id="cell-8" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> statsmodels.tsa.api <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> VAR</span>
<span id="cb4-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> norm</span>
<span id="cb4-3"></span>
<span id="cb4-4">model_nodiff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VAR(Ytrain).fit(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,trend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'n'</span>)</span>
<span id="cb4-5"></span>
<span id="cb4-6">pred_mean_nodiff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model_nodiff.forecast(Ytrain,steps<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytest))</span>
<span id="cb4-7">pred_std_nodiff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.array(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: np.diag(x),<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(model_nodiff.forecast_cov(steps<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytest)))))))</span>
<span id="cb4-8">pred_lower_nodiff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> norm(pred_mean_nodiff,pred_std_nodiff).ppf(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>)</span>
<span id="cb4-9">pred_upper_nodiff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> norm(pred_mean_nodiff,pred_std_nodiff).ppf(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.975</span>)</span>
<span id="cb4-10"></span>
<span id="cb4-11"></span>
<span id="cb4-12">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb4-13"></span>
<span id="cb4-14">plt.plot(Ytrain[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-15">plt.plot(Ytrain[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-16">plt.plot(forecast_range, Ytest[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>)</span>
<span id="cb4-17">plt.plot(forecast_range, Ytest[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>)</span>
<span id="cb4-18"></span>
<span id="cb4-19">plt.plot(forecast_range, pred_mean_nodiff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 1 - Point forecast"</span>)</span>
<span id="cb4-20">plt.plot(forecast_range, pred_mean_nodiff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 2 - Point forecast"</span>)</span>
<span id="cb4-21">plt.fill_between(forecast_range, pred_lower_nodiff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], pred_upper_nodiff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 1 - 95</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb4-22">plt.fill_between(forecast_range, pred_lower_nodiff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], pred_upper_nodiff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 2 - 95</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb4-23"></span>
<span id="cb4-24">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-25">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/cointegrated-time-series-and-when-difference-transformations-might-be-bad_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The summed MSE over both time-series forecasts is <code>0.3463</code>. Surely, the model trained on the differenced data should perform even better:</p>
<div id="cell-10" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">model_diff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> VAR(Ytrain_diff).fit(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,trend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">'n'</span>)</span>
<span id="cb5-2"></span>
<span id="cb5-3">pred_mean_diff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Ytrain[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(model_diff.forecast(Ytrain_diff,steps<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytest)),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)</span>
<span id="cb5-4">pred_std_diff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.cumsum(np.array(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: np.diag(x),<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(model_diff.forecast_cov(steps<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(Ytest)))))),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>))</span>
<span id="cb5-5">pred_lower_diff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> norm(pred_mean_diff,pred_std_diff).ppf(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>)</span>
<span id="cb5-6">pred_upper_diff <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> norm(pred_mean_diff,pred_std_diff).ppf(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.975</span>)</span>
<span id="cb5-7"></span>
<span id="cb5-8"></span>
<span id="cb5-9">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb5-10"></span>
<span id="cb5-11">plt.plot(Ytrain[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-12">plt.plot(Ytrain[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-13">plt.plot(forecast_range, Ytest[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>)</span>
<span id="cb5-14">plt.plot(forecast_range, Ytest[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>)</span>
<span id="cb5-15"></span>
<span id="cb5-16">plt.plot(forecast_range, pred_mean_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 1 - Point forecast"</span>)</span>
<span id="cb5-17">plt.plot(forecast_range, pred_mean_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,ls<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 2 - Point forecast"</span>)</span>
<span id="cb5-18">plt.fill_between(forecast_range, pred_lower_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], pred_upper_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 1 - 95</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb5-19">plt.fill_between(forecast_range, pred_lower_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], pred_upper_diff[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Time-Series 2 - 95</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">% f</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">orecast interval"</span>)</span>
<span id="cb5-20"></span>
<span id="cb5-21">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb5-22">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/cointegrated-time-series-and-when-difference-transformations-might-be-bad_files/figure-html/cell-5-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This time, the summed MSE is <code>0.5105</code> - roughly 50% higher than without differencing. The forecast interval for time-series 1 is also much wider. Something seems to be off with the popular difference transformation.</p>
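<p>For reference, the summed-MSE comparison used here can be computed as follows - a sketch with a hypothetical helper name, since the post does not show this step explicitly:</p>

```python
import numpy as np

def summed_mse(y_true, y_pred):
    """Sum of the per-series mean squared errors (one MSE per column)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean((y_true - y_pred) ** 2, axis=0).sum())

# toy example with two series of length 3:
# first series predicted perfectly, second off by 1 everywhere
actual = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
predicted = np.array([[1.0, 1.0], [2.0, 1.0], [3.0, 1.0]])
result = summed_mse(actual, predicted)  # 0.0 + 1.0
```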
</section>
<section id="why-cointegration-matters" class="level2">
<h2 class="anchored" data-anchor-id="why-cointegration-matters">Why cointegration matters</h2>
<p>Right now, you might - rightfully - argue that the underperformance of the differenced model was due to pure chance. Indeed, we would need much broader experiments to verify our initial claim empirically.</p>
<p>It is, however, possible to actually prove why differencing can be bad for multivariate time-series analysis. To do so, let us take a step back to univariate time-series models and consider why difference transformations work there.</p>
<p>We will only look at AR(1) and VAR(1) time-series for simplicity. All results can be shown to hold for higher-order AR/VAR models, too.</p>
<section id="unit-root-ar1-time-series---when-differencing-is-likely-safe" class="level3">
<h3 class="anchored" data-anchor-id="unit-root-ar1-time-series---when-differencing-is-likely-safe">Unit-Root AR(1) time-series - when differencing is likely safe</h3>
<p>Mathematically, an AR(1) time-series looks as follows: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0Ay_t%20&amp;%20=%5Cphi_1%20y_%7Bt-1%7D+%5Cepsilon_t%20%5C%5C%0A%5Cepsilon_t%20&amp;%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5Csigma%5E2%5Cright)%0A%5Cend%7Baligned%7D%0A"> In order for differencing to make sense, we need the time-series to have a unit root. This is the case when the solution of the characteristic polynomial <img src="https://latex.codecogs.com/png.latex?%0A1-%5Cphi_1%20z=0%0A"> lies on the unit-circle, i.e. <img src="https://latex.codecogs.com/png.latex?%0Az=1.%0A"> The only choice for the AR-parameter is therefore <img src="https://latex.codecogs.com/png.latex?%0A%5Cphi_1=1%0A"> and thus <img src="https://latex.codecogs.com/png.latex?%0Ay_t=y_%7Bt-1%7D+%5Cepsilon_t.%0A"> To make this equation stationary, we subtract the lagged variable from both sides: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t-y_%7Bt-1%7D=y_%7Bt-1%7D-y_%7Bt-1%7D+%5Cepsilon_t%20%5C%5C%0A%5CRightarrow%20%5CDelta%20y_t=%5Cepsilon_t%0A%5Cend%7Bgathered%7D%0A"> Clearly, the best possible forecast now is to predict white noise. Keep in mind that we could equally well fit a model on the untransformed variable. However, the differenced time-series directly uncovers the lack of any truly autoregressive component.</p>
<p>Differencing is thus clearly a good choice for univariate time-series with unit roots. Things are not as simple for multivariate time-series, though.</p>
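<p>As a quick numerical check of this derivation, we can simulate a unit-root AR(1) (a random walk) and confirm that its first difference is exactly the white-noise innovation sequence. A minimal sketch with NumPy; the seed and sample size are arbitrary:</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate a unit-root AR(1), i.e. a random walk: y_t = y_{t-1} + eps_t
n = 5000
eps = rng.normal(0.0, 1.0, n)
y = np.cumsum(eps)

# First differencing recovers the white-noise innovations exactly
dy = np.diff(y)

# Lag-1 autocorrelation of the differenced series is near zero:
# there is nothing autoregressive left to model
rho = np.corrcoef(dy[:-1], dy[1:])[0, 1]
print(abs(rho) < 0.05)
```

<p>The near-zero lag-1 autocorrelation of the differenced series confirms that, in the univariate unit-root case, differencing leaves only noise behind.</p>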
</section>
<section id="multivariate-time-series-with-cointegration" class="level3">
<h3 class="anchored" data-anchor-id="multivariate-time-series-with-cointegration">Multivariate time-series with cointegration</h3>
<p>Consider now a VAR(1) time-series where we replace the scalars in the AR(1) model with vectors (bold, lower-case) and matrices (upper-case): <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D%7D=A%20%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D-%5Cmathbf%7B1%7D%7D+%5Cmathbf%7Bu%7D_%7B%5Cmathbf%7Bt%7D%7D%20%5C%5C%0A%5Cmathbf%7Bu%7D_%7B%5Cmathbf%7Bt%7D%7D%20%5Csim%20%5Cmathcal%7BN%7D(%5Cmathbf%7B0%7D,%20%5CSigma)%0A%5Cend%7Bgathered%7D%0A"> A unit-root in a VAR(1) time-series implies, similarly to the AR(1) case, that <img src="https://latex.codecogs.com/png.latex?%0A%5Coperatorname%7Bdet%7D(I-A)=0.%0A"> In the trivial case, the autoregression parameter is the identity matrix. This implies that the marginals of our VAR(1) time-series are all independent and unit-root. If we exclude this case and proceed as for AR(1), we get <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0A%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D%7D-%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D-%5Cmathbf%7B1%7D%7D=A%20%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D-%5Cmathbf%7B1%7D%7D-%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D-%5Cmathbf%7B1%7D%7D+%5Cmathbf%7Bu%7D_%7B%5Cmathbf%7Bt%7D%7D%20%5C%5C%0A%5CRightarrow%20%5CDelta%20%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D%7D=-(I-A)%20%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D-%5Cmathbf%7B1%7D%7D+%5Cmathbf%7Bu%7D_%7B%5Cmathbf%7Bt%7D%7D%20%5C%5C%0A%5CRightarrow%20%5CDelta%20%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D%7D=%5Ctilde%7BA%7D%20%5Cmathbf%7By%7D_%7B%5Cmathbf%7Bt%7D-%5Cmathbf%7B1%7D%7D+%5Cmathbf%7Bu%7D_%7B%5Cmathbf%7Bt%7D%7D%0A%5Cend%7Bgathered%7D%0A"> The last line is also called a <a href="https://en.wikipedia.org/wiki/Error_correction_model?ref=sarem-seitz.com">Vector Error Correction Representation</a> of a VAR time-series. If you scroll back to our simulation, this is the exact formula that was used to generate the time-series.</p>
<p>By making Ã rank-deficient, the time-series becomes cointegrated, as explained by <a href="https://www.google.com/books/edition/New_Introduction_to_Multiple_Time_Series/muorJ6FHIiEC?hl=de&amp;gbpv=1&amp;dq=l%C3%BCtkepohl&amp;pg=PA249&amp;printsec=frontcover&amp;ref=sarem-seitz.com">Lütkepohl</a>. There exists another, <a href="https://www.youtube.com/watch?v=vvTKjm94Ars&amp;ab_channel=BenLambert&amp;ref=sarem-seitz.com">broader definition of cointegration</a>, but we won’t cover that today.</p>
<p>Clearly, a cointegrated VAR(1) time-series differs from the univariate AR(1) case. Even after differencing, the transformed values depend on the past of the original time-series. We would therefore lose important information if we don’t account for the original time-series anymore.</p>
<p>If you are working with multivariate data, you should therefore not just blindly apply differencing.</p>
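<p>This dependence on the lagged levels is easy to see in simulation. Below is a minimal NumPy sketch of a cointegrated VAR(1), generated in its error-correction form; the adjustment vector and cointegrating vector are made-up illustrative values, not the ones from the article's experiment:</p>

```python
import numpy as np

rng = np.random.default_rng(1)

# Rank-deficient A-tilde = alpha @ beta.T yields a cointegrated VAR(1)
# (alpha: adjustment speeds, beta: cointegrating vector -- illustrative values)
alpha = np.array([[-0.2], [0.1]])
beta = np.array([[1.0], [-1.0]])
A_tilde = alpha @ beta.T

n = 5000
y = np.zeros((n, 2))
for t in range(1, n):
    u = rng.normal(0.0, 1.0, 2)
    y[t] = y[t - 1] + A_tilde @ y[t - 1] + u  # VECM form from the derivation

dy = np.diff(y, axis=0)        # differenced series
ect = (y[:-1] @ beta).ravel()  # error-correction term beta' y_{t-1}

# The differenced series still depends on the lagged *levels*:
rho = np.corrcoef(dy[:, 0], ect)[0, 1]
print(rho < -0.2)
```

<p>The clearly non-zero correlation shows that, unlike in the univariate unit-root case, the differenced series still carries information from the lagged levels of the original series.</p>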
</section>
</section>
<section id="how-to-deal-with-cointegration" class="level2">
<h2 class="anchored" data-anchor-id="how-to-deal-with-cointegration">How to deal with cointegration</h2>
<p>The above result raises the question of how we should handle cointegration. Typically, time-series analysis is concerned either with forecasting or inference. Therefore, two different approaches come to mind:</p>
<p><strong>Cross-validation and backtesting</strong> - the pragmatic, ‘data sciency’ approach. If our goal is primarily to build the most accurate forecast, we don’t necessarily need to detect cointegration at all. As long as the resulting model is performant and reliable, nearly anything goes.</p>
<p>As usual, the ‘best’ model can be selected based on cross-validation and out-of-sample performance tests. The primary implication from cointegration is then to apply differencing with some care.</p>
<p>On the other hand, the above result also suggests that adding the original time-series as a feature might be a good idea in general.</p>
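<p>To make the backtesting idea concrete, here is a minimal rolling-origin evaluation sketch on toy data; the two naive forecasters are placeholders for whatever models you actually want to compare:</p>

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy series: a random walk (illustrative only)
y = np.cumsum(rng.normal(0.0, 1.0, 300))

def naive_last(history):
    # random-walk forecast: repeat the last observation
    return history[-1]

def naive_drift(history):
    # extrapolate the mean first difference one step ahead
    return history[-1] + np.mean(np.diff(history))

# Rolling-origin backtest: at each origin, forecast one step ahead
# using only data observed up to that point, then score against the actual
scores = {"last": [], "drift": []}
for origin in range(200, 300):
    train, actual = y[:origin], y[origin]
    scores["last"].append((naive_last(train) - actual) ** 2)
    scores["drift"].append((naive_drift(train) - actual) ** 2)

mse = {name: float(np.mean(s)) for name, s in scores.items()}
print(sorted(mse))
```

<p>The splitting logic is the point here: every forecast is scored strictly out-of-sample, so any mishandling of cointegration (or anything else) shows up directly in the error metric.</p>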
<p><strong>Statistical tests</strong> - the classical statistics way. Obviously, cointegration is nothing new to econometricians and statisticians. If you are interested in learning about the generating process itself, this approach is likely more expedient.</p>
<p>Luckily, the <a href="https://scholar.google.com/scholar?hl=de&amp;as_sdt=0%2C5&amp;q=mackinnon+cointegration&amp;btnG=&amp;oq=mackinnon+coint&amp;ref=sarem-seitz.com">work of James MacKinnon</a> provides extensive insights into tests for cointegration. Other popular cointegration tests have been developed by <a href="https://scholar.google.com/scholar?hl=com&amp;as_sdt=0%2C5&amp;q=engle+granger+cointegration+test&amp;btnG=&amp;oq=engle+gran&amp;ref=sarem-seitz.com">Engle and Granger</a> and <a href="http://jerrydwyer.com/pdf/Clemson/Cointegration.pdf?ref=sarem-seitz.com">Søren Johansen</a>.</p>
<p>In Python, you can find the MacKinnon test in the <a href="https://www.statsmodels.org/dev/generated/statsmodels.tsa.stattools.coint.html?ref=sarem-seitz.com">statsmodels library</a>. For the above time-series, the test yields a p-value of almost zero.</p>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Hopefully, this article opened your eyes to the danger of just differencing every time-series straight away. You should be aware by now that cointegration is a peculiarity of multivariate time-series that needs to be treated with care.</p>
<p>Keep in mind that standard cointegration theory is concerned with linear time-series only. Once non-linear dynamics are present, things can become even messier and differencing might be even less suitable.</p>
<p>Indeed, there exists some <a href="https://www.tandfonline.com/doi/full/10.1080/07474938.2020.1771900?ref=sarem-seitz.com">recent research on non-linear cointegration</a>. You might want to take a look at it for further details.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Engle, Robert F.; Granger, Clive WJ. Co-integration and error correction: representation, estimation, and testing. Econometrica: journal of the Econometric Society, 1987, p.&nbsp;251-276.</p>
<p><strong>[2]</strong> Hamilton, James Douglas. Time series analysis. Princeton university press, 2020.</p>
<p><strong>[3]</strong> Lütkepohl, Helmut. New introduction to multiple time series analysis. Springer Science &amp; Business Media, 2005.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/cointegrated-time-series-and-when-difference-transformations-might-be-bad.html</guid>
  <pubDate>Thu, 25 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Facebook Prophet, Covid and why I don’t trust the Prophet</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>Facebook Prophet is arguably one of the most widely known tools for time-series forecasting and related tasks. Ask any data scientist who works with time-series data if they know Prophet and the answer is likely either a yes or an annoyed yes.</p>
<p>After all, Facebook Prophet has become quite a controversial tool for time-series problems. Some people don’t want to work without it anymore, <a href="https://medium.com/@valeman/benchmarking-facebook-prophet-53273c3ee9c6?ref=sarem-seitz.com">others clearly hate it</a>.</p>
<p>However, whether you like it or not, <a href="https://github.com/facebook/prophet/issues/1416?ref=sarem-seitz.com">Prophet users seem to face considerable challenges when it comes to modelling the Covid-19 shock</a>. While, by now, people have found workarounds, I’d argue that these issues are caused by a deeper problem with Facebook Prophet:</p>
</section>
<section id="the-fundamental-flaw-of-facebook-prophet" class="level2">
<h2 class="anchored" data-anchor-id="the-fundamental-flaw-of-facebook-prophet">The fundamental flaw of Facebook Prophet</h2>
<p>The most problematic aspect of Facebook Prophet is that it reduces time-series modelling to a curve-fitting task. Other approaches make auto-regressive dynamics a fundamental assumption. Prophet, on the other hand, merely fits a least-error curve through your data as a function of time.</p>
<p>More technically, the evolution of almost all dynamical systems depends on past realizations. We can write this as follows: <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(y_t,%20y_%7Bt-1%7D,%20%5Cldots,%20y_1%5Cright)=p%5Cleft(y_t%20%5Cmid%20y_%7Bt-1%7D%5Cright)%20p%5Cleft(y_%7Bt-1%7D%20%5Cmid%20y_%7Bt-2%7D%5Cright)%20%5Ccdots%20p%5Cleft(y_1%5Cright)%0A"> where, for simplicity, we only consider dependence on the last observation and no hidden states.</p>
<p>Most time-series models focus on the right-hand side. Facebook Prophet, however, is concerned with the left-hand side of the equation. Even worse, Prophet implicitly makes the following assumptions on top (see <a href="https://peerj.com/preprints/3190/?ref=sarem-seitz.com">Facebook Prophet paper</a>, page 14, for reference): <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap%5Cleft(y_t,%20y_%7Bt-1%7D,%20%5Cldots,%20y_1%5Cright)%20%5C%5C%0A=%5Cmathcal%7BN%7D%5Cleft(y_t%20%5Cmid%20m(t),%20%5Csigma%5E2%5Cright)%20%5Ccdot%20%5Cmathcal%7BN%7D%5Cleft(y_%7Bt-1%7D%20%5Cmid%20m(t-1),%20%5Csigma%5E2%5Cright)%20%5Ccdots%20%5Cmathcal%7BN%7D%5Cleft(y_1%20%5Cmid%20m(1),%20%5Csigma%5E2%5Cright)%0A%5Cend%7Bgathered%7D%0A"> This is problematic for at least three reasons:</p>
<ol type="1">
<li><strong>Dependence on past realizations is completely ignored</strong>. In the real world, a single large shock will quickly change the whole future trajectory of the time-series. This can trivially be accounted for by a dynamical model but not by Facebook Prophet.</li>
<li><strong>The mean function needs to extrapolate outside the range of observed values</strong>. The way that Prophet frames the modelling problem inevitably leads to the problem of out-of-distribution generalization. All your future t’s will lie outside your training domain by design.</li>
<li><strong>Variance is presumed to be constant</strong>. Related to 1. - if random shocks have impact on the future, variance - as a measure of uncertainty - should grow as we are forecasting further ahead.</li>
</ol>
<p>As a general rule of thumb: If the forecast intervals of your model do not grow over time, something is likely wrong with it. Unless you know exactly what you are doing, you should consider an alternative.</p>
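<p>This rule of thumb is easy to verify by Monte Carlo: for a simple random walk, the width of the 95% forecast interval grows roughly with the square root of the horizon. A sketch (unit noise variance is an arbitrary choice):</p>

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulate many possible random-walk futures and measure how the empirical
# 95% forecast interval widens with the horizon
n_paths, horizon = 10000, 12
paths = np.cumsum(rng.normal(0.0, 1.0, (n_paths, horizon)), axis=1)

lower = np.quantile(paths, 0.025, axis=0)
upper = np.quantile(paths, 0.975, axis=0)
width = upper - lower

# Width grows roughly like 2 * 1.96 * sqrt(h)
print(width[0] < width[5] < width[-1])
```

<p>Any model whose intervals stay flat over such a horizon is, by implication, ignoring the accumulation of shocks.</p>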
</section>
<section id="facebook-prophet-underperforms-against-a-toy-benchmark" class="level2">
<h2 class="anchored" data-anchor-id="facebook-prophet-underperforms-against-a-toy-benchmark">Facebook Prophet underperforms against a toy benchmark</h2>
<p>To exemplify the above, I ran a pretty simple forecasting benchmark on German economic data. While the example is a little artificial and too small to generalize, the implications should be clear.</p>
<p>I used the following dataset: Retail sale in non-specialised stores (ex. food) - Jan 2012 - May 2022 (monthly; available [here]). The train set consists of all data from Jan 2012 to Dec 2019; the test set uses all data from Jan 2020 - May 2022:</p>
<div id="cell-5" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> datetime <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> date</span>
<span id="cb1-5"></span>
<span id="cb1-6"></span>
<span id="cb1-7">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.read_excel(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/45212-0004.xlsx"</span>)</span>
<span id="cb1-8">ts <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">804</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">937</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>].replace(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"..."</span>,np.nan).dropna()</span>
<span id="cb1-9">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([pd.Series(pd.date_range(date(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2012</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),date(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2022</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),freq<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"M"</span>)),ts.reset_index(drop<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-10">df.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ds"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>]</span>
<span id="cb1-11">df.index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ds"</span>]</span>
<span id="cb1-12"></span>
<span id="cb1-13">df_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">96</span>,:]</span>
<span id="cb1-14">df_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.iloc[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">96</span>:,:]</span>
<span id="cb1-15"></span>
<span id="cb1-16">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">14</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb1-17">plt.plot(df_train[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train"</span>)</span>
<span id="cb1-18">plt.plot(df_test[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test"</span>)</span>
<span id="cb1-19">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-20">plt.legend()</span>
<span id="cb1-21">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Germany, Retail sale in non-specialised stores (ex. food) - Jan 2012 - May 2022"</span>)</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/openpyxl/styles/stylesheet.py:226: UserWarning: Workbook contains no default style, apply openpyxl's default
  warn("Workbook contains no default style, apply openpyxl's default")
/var/folders/2d/hl2cr85d2pb2kfbmsng3267c0000gn/T/ipykernel_63305/4068186796.py:9: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  df = pd.concat([pd.Series(pd.date_range(date(2012,1,1),date(2022,6,1),freq="M")),ts.reset_index(drop=True)],1)</code></pre>
</div>
<div class="cell-output cell-output-display" data-execution_count="3">
<pre><code>Text(0.5, 1.0, 'Germany, Retail sale in non-specialised stores (ex. food) - Jan 2012 - May 2022')</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet_files/figure-html/cell-2-output-3.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>A reasonable forecasting model should be able to anticipate at least the possibility of random shocks. This would usually be visible in the form of widening forecast intervals. After all, the further we look ahead, the more opportunities there are for high-impact events.</p>
<p>In this case, the time-series does not go completely bonkers after the shock from Corona. Thus, Facebook Prophet should not struggle too much here. Let’s see how it does:</p>
<section id="a-simple-prophet-forecast" class="level3">
<h3 class="anchored" data-anchor-id="a-simple-prophet-forecast">A simple Prophet forecast</h3>
<div id="cell-7" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> prophet <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Prophet</span>
<span id="cb4-2"></span>
<span id="cb4-3">m <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Prophet()</span>
<span id="cb4-4">m.fit(df_train)</span>
<span id="cb4-5"></span>
<span id="cb4-6">prph_pred <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> m.predict(df_test)</span>
<span id="cb4-7"></span>
<span id="cb4-8">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">14</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb4-9">plt.plot(df_train[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train"</span>)</span>
<span id="cb4-10">plt.plot(df_test[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test"</span>)</span>
<span id="cb4-11">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb4-12">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Germany, Retail sale in non-specialised stores (ex. food) - Jan 2012 - May 2022"</span>)</span>
<span id="cb4-13"></span>
<span id="cb4-14">plt.plot(df_test.index,prph_pred[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yhat"</span>],label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Prophet Forecast"</span>)</span>
<span id="cb4-15">plt.fill_between(df_test.index,prph_pred[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yhat_lower"</span>],prph_pred[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yhat_upper"</span>],alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>)</span>
<span id="cb4-16"></span>
<span id="cb4-17">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>18:21:49 - cmdstanpy - INFO - Chain [1] start processing
18:21:50 - cmdstanpy - INFO - Chain [1] done processing</code></pre>
</div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet_files/figure-html/cell-3-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>For the mean forecast, Prophet was able to predict reasonably out-of-sample - at least to some extent. The forecast intervals, however, are completely ludicrous: the model essentially took the in-sample residuals and projected their spread unchanged into the future.</p>
<p>This clearly shows that Facebook Prophet did not really learn the inherent dynamics but merely a function of time. If the impact of Covid on the underlying dynamics had been worse, we would likely not have seen reasonable point forecasts either. I am sure there are many data scientists out there for whom this was exactly the case.</p>
</section>
<section id="an-even-simpler-forecast-model" class="level3">
<h3 class="anchored" data-anchor-id="an-even-simpler-forecast-model">An even simpler forecast model</h3>
<p>As argued above, Prophet does not learn anything about the underlying system dynamics. Our goal now is to create a competitor that is <strong>a)</strong> very simple and <strong>b)</strong> capable of actually modelling the dynamics.</p>
<p>From the time-series plot, we see that there is a clear yearly seasonality. After removing that via <a href="https://faculty.fuqua.duke.edu/~rnau/Decision411_2007/Class10notes.htm?ref=sarem-seitz.com">seasonal differencing</a>, I saw that there was a <a href="https://en.wikipedia.org/wiki/Order_of_integration?ref=sarem-seitz.com#:~:text=A%20time%20series%20is%20integrated,times%20yields%20a%20stationary%20process.">remaining integration component</a> that I removed via another round of first-order differencing.</p>
<p>Obviously, this is not a full diagnostic but sufficient for our simple toy example. Also, since the time-series is non-negative, I initially transformed it by taking the square-root. This ensures that the re-transformed series will be non-negative as well.</p>
<p>All the above leads us to the following, relatively simple model: <img src="https://latex.codecogs.com/png.latex?%0A%5CDelta%20%5CDelta_%7B12%7D%20%5Csqrt%7By_t%7D%20%5Csim%20p(%5Cepsilon)%0A"> In summary, we assume that, after ‘square-rooting’ and differencing, only a noise term remains. Here, we even assume that the noise distribution stays constant over time. A more sophisticated model should obviously check for time-varying noise.</p>
<p>The only thing that our model now needs to learn is the distribution of the noise. Afterwards, we draw noise samples and re-integrate (i.e.&nbsp;re-transform the differencing operations). Finally, we estimate point and interval forecasts.</p>
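<p>Before fitting anything, it is worth sanity-checking that this transformation chain can be inverted exactly. A small round-trip sketch on synthetic data (not the retail series used above):</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)

# Forward transform: square root -> first difference -> seasonal (lag-12) difference
y = pd.Series(rng.uniform(50.0, 150.0, 60))  # positive toy series
root = np.sqrt(y)
d1 = root.diff().dropna()
d12 = d1.diff(12).dropna()

# Inverse transform: undo the seasonal difference, then the first difference,
# then square back to the original scale
d1_back = d12.add(d1.shift(12)).dropna()
root_back = d1_back.add(root.shift(1)).dropna()
y_back = root_back ** 2

# The round trip recovers the original series (apart from the 13 lost lags)
print(np.allclose(y_back.values, y.iloc[13:].values))
```

<p>In an actual forecast, the same inverse operations are applied step by step to the sampled noise paths instead of the observed differences.</p>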
<p>To learn the noise distribution, I used scipy’s <code>gaussian_kde</code> function. This fits a <a href="https://en.wikipedia.org/wiki/Kernel_density_estimation?ref=sarem-seitz.com">Gaussian kernel density</a> estimator to the data. We can then use this estimate to draw noise samples:</p>
<div id="cell-9" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> scipy.stats <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> gaussian_kde</span>
<span id="cb6-2"></span>
<span id="cb6-3"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#reverting the order of differencing yields the same result but makes re-transformation easier</span></span>
<span id="cb6-4">diffed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(df_train[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>]).diff().dropna()</span>
<span id="cb6-5">diffed_s <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> diffed.diff(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>).dropna()</span>
<span id="cb6-6"></span>
<span id="cb6-7">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb6-8">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb6-9">plt.hist(diffed_s,bins<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>,density <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Histogram of diffed time-series"</span>)</span>
<span id="cb6-10"></span>
<span id="cb6-11">kde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> gaussian_kde(diffed_s)</span>
<span id="cb6-12"></span>
<span id="cb6-13">target_range <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.linspace(np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(diffed_s)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,np.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(diffed_s)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,num<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)</span>
<span id="cb6-14"></span>
<span id="cb6-15">plt.plot(target_range, kde.pdf(target_range),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gaussian Kernel Density of diffed time-series"</span>)</span>
<span id="cb6-16"></span>
<span id="cb6-17">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Next, we draw samples and re-transform them into a forecast of our original time-series:</p>
<div id="cell-11" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">321</span>)</span>
<span id="cb7-2"></span>
<span id="cb7-3">full_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [] </span>
<span id="cb7-4"></span>
<span id="cb7-5"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>):</span>
<span id="cb7-6">    draw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> kde.resample(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(df_test)).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-7">    result <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(diffed.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:].values)</span>
<span id="cb7-8"></span>
<span id="cb7-9">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(df_test)):</span>
<span id="cb7-10">        result.append(result[t]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>draw[t])</span>
<span id="cb7-11"></span>
<span id="cb7-12">    full_sample.append(np.array((np.sqrt(df_train.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>][<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(result[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:]))).reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb7-13"></span>
<span id="cb7-14"></span>
<span id="cb7-15">reshaped <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.concatenate(full_sample,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-16">result_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.mean(reshaped,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-17">lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(reshaped,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-18">upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.quantile(reshaped,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-19"></span>
<span id="cb7-20">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">14</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb7-21">plt.plot(df_train[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train"</span>)</span>
<span id="cb7-22">plt.plot(df_test[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test"</span>)</span>
<span id="cb7-23">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb7-24">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Germany, Retail sale in non-specialised stores (ex. food) - Jan 2012 - May 2022 + Forecast"</span>)</span>
<span id="cb7-25"></span>
<span id="cb7-26">plt.plot(df_test.index, result_mean,label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Toy model forecast"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb7-27">plt.fill_between(df_test.index,lower,upper,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb7-28"></span>
<span id="cb7-29"></span>
<span id="cb7-30">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet_files/figure-html/cell-5-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The forecast interval, in particular, makes much more sense here than Facebook Prophet’s. As desired, the intervals grow wider over time, reflecting increasing uncertainty about the more distant future.</p>
<p>Let us also do a side-by-side comparison:</p>
<div id="cell-13" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb8-1">plt.figure(figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">14</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb8-2">plt.plot(df_train[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Train"</span>)</span>
<span id="cb8-3">plt.plot(df_test[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Test"</span>)</span>
<span id="cb8-4">plt.grid(alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb8-5">plt.title(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Germany, Retail sale in non-specialised stores (ex. food) - Jan 2012 - May 2022 + Forecast"</span>)</span>
<span id="cb8-6"></span>
<span id="cb8-7">plt.plot(df_test.index, result_mean,label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Toy model forecast"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb8-8">plt.fill_between(df_test.index,lower,upper,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb8-9"></span>
<span id="cb8-10">plt.plot(df_test.index,prph_pred[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yhat"</span>],label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Prophet Forecast"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>)</span>
<span id="cb8-11"></span>
<span id="cb8-12">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>We can also calculate the RMSE for both mean forecasts:</p>
<div id="cell-15" class="cell" data-execution_count="14">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1">rmse_simple <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((result_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>df_test[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>].values)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-2">rmse_prophet <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.sqrt(np.mean((prph_pred[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"yhat"</span>].values[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(df_test):]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>df_test[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"y"</span>].values)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb9-3"></span>
<span id="cb9-4"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Simple Model: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(rmse_simple))</span>
<span id="cb9-5"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">print</span>(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Prophet: </span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(rmse_prophet))</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>Simple Model: 20.745456460849358
Prophet: 22.49447614293072</code></pre>
</div>
</div>
</section>
</section>
<section id="takeaways-for-the-dedicated-prophet-user" class="level2">
<h2 class="anchored" data-anchor-id="takeaways-for-the-dedicated-prophet-user">Takeaways for the dedicated Prophet user</h2>
<p>The question is, what do we make of this? Clearly, Prophet’s large, long-standing user-base indicates that people get real value from it. Also, Facebook/Meta employs some very bright people - it’s highly unlikely that they would produce a completely useless library.</p>
<p>Going back to our initial considerations, we can deduce the following:</p>
<p>Prophet should work fine as long as it correctly depicts the conditional mean and the conditional variance. Mathematically,</p>
<p><img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Am_%7B%5Ctext%20%7Bprophet%20%7D%7D(t+h)%20%5Capprox%20%5Cmathbb%7BE%7D%5Cleft%5By_%7Bt+h%7D%20%5Cmid%20y_t,%20%5Cldots,%20y_1%5Cright%5D%20%5C%5C%0Av_%7B%5Ctext%20%7Bprophet%20%7D%7D(t+h)=%5Csigma%5E2%20%5Capprox%20%5Coperatorname%7BVar%7D%5Cleft%5By_%7Bt+h%7D%20%5Cmid%20y_t,%20%5Cldots,%20y_1%5Cright%5D%0A%5Cend%7Bgathered%7D%0A"> for all forecast periods <img src="https://latex.codecogs.com/png.latex?t+h">.</p>
<p>This could be the case when the underlying system is in a nice, stable equilibrium - e.g.&nbsp;when the economy is in a non-volatile phase. As soon as there is a large shock, however, the constant-variance requirement is almost certain to be broken. This is exactly what we saw in the example time-series above.</p>
<p>Thus, should you drop Prophet altogether? If the results are good <strong>and</strong> if your forecasts going completely nuts does not have a large negative impact, I’d argue that you should keep it. Never change a running system - at least not overnight.</p>
<p>If you are heavily dependent on a model that can randomly break at any time, though, you might want to start looking for alternatives.</p>
<p>Another use-case where Facebook Prophet makes more sense in my opinion, is outlier and change point detection. If you are simply interested in deviations from the expected trajectory, prophet can score you some quick and easy wins. As soon as forecast quality becomes a thing, however, you should be careful.</p>
<p>Will Neural Prophet, a.k.a. Facebook Prophet 2.0, make things better? At least its Deep-AR module now considers past realizations to predict the future. On the other hand, Neural Prophet still makes heavy use of curve fitting. Thus, you should be wary of the Prophet upgrade, too.</p>
<p>If you decide that you are going to use either of the Prophets, I recommend benchmarking against trivial alternatives. When a simple but theoretically more sound model - as in our example - performs comparably, you might want to reconsider your choice.</p>
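<p>One such trivial alternative is a seasonal naive forecast, which simply repeats the last observed seasonal cycle over the forecast horizon. A minimal sketch, assuming monthly data in plain Python lists (the toy sine series and the names <code>train</code>/<code>test</code> are purely illustrative):</p>

```python
import math

def seasonal_naive_forecast(train, horizon, season_length=12):
    """Repeat the last observed seasonal cycle over the forecast horizon."""
    last_cycle = train[-season_length:]
    return [last_cycle[h % season_length] for h in range(horizon)]

def rmse(forecast, actual):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((f - a) ** 2 for f, a in zip(forecast, actual)) / len(actual))

# toy monthly series with an exactly repeating yearly pattern (hypothetical data)
train = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(48)]
test = [10 + 5 * math.sin(2 * math.pi * t / 12) for t in range(48, 60)]

forecast = seasonal_naive_forecast(train, horizon=len(test))
print(rmse(forecast, test))  # near zero here, since the toy pattern repeats exactly
```

If a model like Prophet cannot clearly beat this kind of benchmark on your data, the added complexity is hard to justify.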
</section>
<section id="alternatives-to-facebook-prophet" class="level2">
<h2 class="anchored" data-anchor-id="alternatives-to-facebook-prophet">Alternatives to Facebook Prophet</h2>
<p>No mini-rant without an attempt at a solution. On the one hand, these alternatives require more manual work to find a suitable model. On the other hand, chances are good that the resulting model will be more robust than a convenient <code>Prophet().fit()</code>.</p>
<ul>
<li><a href="https://facebookresearch.github.io/Kats/?ref=sarem-seitz.com">Kats</a>: While Kats is a broad library for general time-series analysis, it offers some endpoints for forecasting as well. Just like Prophet, it has been open-sourced by Facebook/Meta.</li>
<li><a href="https://unit8co.github.io/darts/?ref=sarem-seitz.com#">Darts</a>: Specifically aimed at forecasting problems. Darts provides support for a variety of modelling options.</li>
<li><a href="https://tsfresh.readthedocs.io/en/latest/?ref=sarem-seitz.com#">tsfresh</a>: This package only creates a large set of time-series summary statistics for you. Then, you can use those features as predictors in a custom forecasting model. Pretty flexible, but also more manual work.</li>
</ul>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Despite its popularity, Facebook Prophet contains some serious theoretical issues. These flaws can easily render its forecasts useless. On the one hand, Prophet makes building forecast models at scale more or less a breeze. This convenience, however, comes at the cost of a fair amount of unreliability.</p>
<p>To summarize all of the above: as long as you expect your time-series to remain somewhat stable, Prophet can be a helpful plug-and-play solution. However, don’t be fooled by Prophet being right many times in a row. In the worst case, you go bust the moment it suddenly isn’t.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/facebook-prophet-covid-and-why-i-dont-trust-the-prophet.html</guid>
  <pubDate>Mon, 08 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Probabilistic CUSUM for change point detection</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/probabilistic-cusum-for-change-point-detection.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>According to the famous principle of <a href="https://en.wikipedia.org/wiki/Occam%27s_razor?ref=sarem-seitz.com">Occam’s Razor</a>, simpler models are more likely to be close to the truth than complex ones. For change point detection problems - as in IoT or finance applications - arguably the simplest model is the Cumulative Sum (CUSUM) algorithm.</p>
<p>Despite its simplicity, CUSUM can be a powerful tool. In fact, it requires only a few loose assumptions on the underlying time-series. If these assumptions are met, it is possible to prove a plethora of helpful statistical properties.</p>
</section>
<section id="a-quick-look-at-cusum" class="level2">
<h2 class="anchored" data-anchor-id="a-quick-look-at-cusum">A quick look at CUSUM</h2>
<p>In summary, CUSUM detects shifts in the mean of a time-series that is stationary between two changepoints. Consider the following time-series:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/probabilistic-cusum-for-change-point-detection/changepoints.png" class="img-fluid figure-img" alt="Example per-segment stationary time-series (blue) with change points."></p>
<figcaption>Example per-segment stationary time-series (blue) with change points (straight green lines). CUSUM can handle such data.</figcaption>
</figure>
</div>
<p>This example is stationary between each pair of change points and thus a perfect use-case for our CUSUM algorithm. For change point detection on a non-stationary time-series like the next one, CUSUM will likely not work as intended:</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/probabilistic-cusum-for-change-point-detection/changepoint_nostat.png" class="img-fluid figure-img" alt="Example time-series with non-stationarity between two change points."></p>
<figcaption>Example time-series with non-stationarity between two change points. CUSUM won’t work properly with such data.</figcaption>
</figure>
</div>
<p>While CUSUM might still be able to detect shifts from a stationary to a non-stationary segment, there is no guarantee that it does so reliably anymore.</p>
<p>In general, the idea behind CUSUM can roughly be summarized as follows:</p>
<blockquote class="blockquote">
<p>If a time-series has constant zero mean, the cumulative sum of its realizations converges to a zero-mean Normal distribution (given some relatively loose technical assumptions). Thus, if the cumulative sum diverges from a zero-mean Normal distribution, a change-point in the underlying time-series might have occurred.</p>
</blockquote>
<p>We can derive this from one of the many <a href="https://en.wikipedia.org/wiki/Central_limit_theorem?ref=sarem-seitz.com">central limit theorems</a> (CLTs). While each CLT has some additional requirements (e.g.&nbsp;independent draws and finite variance for the <a href="https://en.wikipedia.org/wiki/Central_limit_theorem?ref=sarem-seitz.com#:~:text=set.%5B5%5D-,Lyapunov,-CLT%5Bedit">Lyapunov CLT</a>), chances are good that your particular time-series fulfils one of them.</p>
<p>In practice, we would estimate the mean of the current regime, subtract it from the time-series and calculate the cumulative sum. This only leaves the question of setting a rule for when a change point has happened.</p>
<p>The standard CUSUM algorithm as described on <a href="https://en.wikipedia.org/wiki/CUSUM?ref=sarem-seitz.com">Wikipedia</a> suggests summing the z-standardized realizations of the time-series. A change point then occurs whenever this sum exceeds a pre-defined threshold. The whole procedure is therefore an ‘online’ algorithm, i.e.&nbsp;we can use it on a live data stream.</p>
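<p>To make this concrete, here is a minimal, stand-alone sketch of that thresholded variant. The warmup length, threshold value and toy data are illustrative choices, not part of any standard:</p>

```python
import random
import statistics

def cusum_detect(xs, warmup=30, threshold=15.0):
    """Plain CUSUM sketch: estimate mean/std on a warmup window, then flag a
    change point once the cumulative sum of z-scores leaves [-threshold, threshold]."""
    mu = statistics.fmean(xs[:warmup])
    sigma = statistics.stdev(xs[:warmup])
    s = 0.0
    for t in range(warmup, len(xs)):
        s += (xs[t] - mu) / sigma
        if abs(s) > threshold:
            return t  # index of the detected change point
    return None  # no change point found

# toy series: stationary noise, then an upward mean shift at index 100
random.seed(42)
series = [random.gauss(0, 1) for _ in range(100)] + [random.gauss(3, 1) for _ in range(50)]
print(cusum_detect(series))
```

Note how arbitrary the threshold is - this is exactly the weakness discussed next.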
</section>
<section id="some-problems-with-the-standard-cusum-algorithm" class="level2">
<h2 class="anchored" data-anchor-id="some-problems-with-the-standard-cusum-algorithm">Some problems with the standard CUSUM algorithm</h2>
<p>You might have already asked yourself how you should set the change point threshold values in CUSUM. After all, setting the threshold too loose will lead to undetected change points. On the other hand, narrow thresholds can easily lead to frequent false alarms.</p>
<p>Unfortunately, it is not easy to find clear instructions for solving this question. While a rule of thumb or experimenting with some settings might occasionally work, this is clearly not a reliable solution. It is also not feasible when we want to apply CUSUM to a large number of data streams.</p>
<p>Another issue concerns the level of anomaly that a given subsequence exhibits. Even if no change point happens, it might still be relevant to discover when a time-series is behaving unexpectedly.</p>
<p>Luckily, we can approach both challenges with a slight modification of the raw CUSUM algorithm.</p>
</section>
<section id="a-probabilistic-version-of-cusum" class="level2">
<h2 class="anchored" data-anchor-id="a-probabilistic-version-of-cusum">A probabilistic version of CUSUM</h2>
<p>At this point, we will finally need some equations. First, we define the standardized observations of an arbitrary subsequence of our time-series: <img src="https://latex.codecogs.com/png.latex?%0AZ_t=%5Cfrac%7BX_t-%5Chat%7B%5Cmu%7D_X%7D%7B%5Chat%7B%5Csigma%7D_X%7D%0A"> The hat-notation stresses that we can only ever work with estimates of the mean and standard deviation of our series. We can calculate these values, for example, by using the first N realizations for our estimates.</p>
<p>If we presume that the conditions of some CLT hold for our sequence, the following holds approximately and in the limit: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AS_T=%5Csum_%7Bt=1%7D%5ET%20Z_t%20%5Csim%20%5Cmathcal%7BN%7D(0,%20T)%20%5C%5C%0A%5CRightarrow%20%5Ctilde%7BS%7D_T=%5Cfrac%7B1%7D%7B%5Csqrt%7BT%7D%7D%20S_T%20%5Csim%20%5Cmathcal%7BN%7D(0,1)%0A%5Cend%7Bgathered%7D%0A"> By dividing the cumulative sum by the square root of the time-frame, we get a (<strong>theoretical</strong>) standard Normal distribution. Thus, as long as our CLT assumptions are valid, the following holds for the standardized cumulative sum of the realized time-series: <img src="https://latex.codecogs.com/png.latex?%0A%5CPhi%5Cleft(%5Ctilde%7Bs%7D_T%5Cright)%20%5Capprox%20P%5Cleft(%5Ctilde%7BS%7D_T%20%5Cleq%20%5Ctilde%7Bs%7D_T%5Cright),%0A"> where <img src="https://latex.codecogs.com/png.latex?%5CPhi(%5Ccdot)"> denotes the c.d.f. of a standard Normal distribution.</p>
<p>The resulting value can be interpreted as the probability of the theoretical cumulative sum being as small as the one we are observing. This is actually equivalent to the definition of <a href="https://en.wikipedia.org/wiki/P-value?ref=sarem-seitz.com">a p-value in classical hypothesis testing</a>.</p>
<p>Notice however that the above quantity currently only works in one direction, i.e.&nbsp;if the standardized sum is negative. In order to make this a two-sided statistic, we can ask for the probability of the standardized sum being at least as far away from the mean as our realized value. Since our sum is a scalar value, we can define ‘distance from zero’ simply as the absolute value and simplify: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AP%5Cleft(%5Cleft%7C%5Ctilde%7BS%7D_T%5Cright%7C%20%5Cgeq%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%20%5C%5C%0A=1-P%5Cleft(-%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%20%5Cleq%20%5Ctilde%7BS%7D_T%20%5Cleq%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%20%5C%5C%0A=1-%5Cleft%5BP%5Cleft(%5Ctilde%7BS%7D_T%20%5Cleq%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)-P%5Cleft(%5Ctilde%7BS%7D_T%20%5Cleq-%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%5Cright%5D%20%5C%5C%0A=1-%5Cleft%5BP%5Cleft(%5Ctilde%7BS%7D_T%20%5Cleq%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)-%5Cleft(1-P%5Cleft(%5Ctilde%7BS%7D_T%20%5Cleq%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%5Cright)%5Cright%5D%5C%5C%0A=2%5Cleft(1-P%5Cleft(%5Ctilde%7BS%7D_T%20%5Cleq%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%5Cright)%20%5C%5C%0A=2%5Cleft(1-%5CPhi%5Cleft(%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%5Cright)%0A%5Cend%7Bgathered%7D%0A"> We can now use this probability instead of the raw standardized CUSUM sum for change point detection. Contrary to the original sum, this measure has a clear, probabilistic interpretation. For each new datapoint we directly obtain a measure of how extreme the respective observation is.</p>
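<p>As a sketch, this two-sided quantity can be computed with nothing more than the standard Normal c.d.f. (here via <code>math.erf</code>); the toy data and the mean shift of 1.0 are illustrative assumptions:</p>

```python
import math
import random

def cusum_probability(z):
    """p_T = 2 * (1 - Phi(|S_T| / sqrt(T))) for z-standardized observations z."""
    T = len(z)
    s_tilde = sum(z) / math.sqrt(T)
    # standard Normal c.d.f. via the error function
    phi = 0.5 * (1.0 + math.erf(abs(s_tilde) / math.sqrt(2.0)))
    return 2.0 * (1.0 - phi)

random.seed(1)
# pretend the warmup estimates were mu=0, sigma=1, so z-scores equal the raw draws
stable = [random.gauss(0.0, 1.0) for _ in range(50)]   # mean as estimated
shifted = [random.gauss(1.0, 1.0) for _ in range(50)]  # mean has shifted upwards
print(cusum_probability(stable), cusum_probability(shifted))
```

For the stable segment the probability is some unremarkable value in (0, 1); for the shifted segment it collapses towards zero, signalling a highly unlikely cumulative sum.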
<p>Once a certain threshold of ‘unlikeliness’ is surpassed, we mark the respective timestamp as a change point and restart the algorithm.</p>
<p>Roughly, the algorithm looks as follows:</p>
<ol start="0" type="1">
<li>Define <img src="https://latex.codecogs.com/png.latex?p_%7B%5Ctext%20%7Blimit%20%7D%7D%20%5Cin(0,1),%20T_%7B%5Ctext%20%7Bwarmup%20%7D%7D%3E1"></li>
<li>Collect observations <img src="https://latex.codecogs.com/png.latex?x_T"> while <img src="https://latex.codecogs.com/png.latex?T%3CT_%7B%5Ctext%20%7Bwarmup%20%7D%7D"></li>
<li>If <img src="https://latex.codecogs.com/png.latex?T=T_%7B%5Ctext%7Bwarmup%7D%7D">, calculate <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7B%5Cmu%7D_X=%5Cfrac%7B1%7D%7BT%7D%20%5Csum_%7Bt=1%7D%5ET%20x_t,%20%5Chat%7B%5Csigma%7D_X=%5Csqrt%7B%5Cfrac%7B1%7D%7BT-1%7D%20%5Csum_%7Bt=1%7D%5ET%5Cleft(x_t-%5Chat%7B%5Cmu%7D_X%5Cright)%5E2%7D%0A"></li>
<li>Calculate <img src="https://latex.codecogs.com/png.latex?%5Ctilde%7Bs%7D_T"> for <img src="https://latex.codecogs.com/png.latex?T%20%5Cin%5B1,%20T%5D"> and <img src="https://latex.codecogs.com/png.latex?p_T=2%5Cleft(1-%5CPhi%5Cleft(%5Cleft%7C%5Ctilde%7Bs%7D_T%5Cright%7C%5Cright)%5Cright)"></li>
<li>If <img src="https://latex.codecogs.com/png.latex?p_T%3Cp_%7B%5Ctext%20%7Blimit%20%7D%7D">, detect a change point and reset <img src="https://latex.codecogs.com/png.latex?T,%20%5Chat%7B%5Cmu%7D_X,%20%5Chat%7B%5Csigma%7D_X"></li>
</ol>
<p>In Python, a possible implementation could look as follows. I used PyTorch to allow for potential future extensions with autograd functionality:</p>
<div id="cell-7" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> torch</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb1-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> typing <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> Tuple</span>
<span id="cb1-4"></span>
<span id="cb1-5"></span>
<span id="cb1-6"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> CusumMeanDetector():</span>
<span id="cb1-7">        </span>
<span id="cb1-8">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, t_warmup <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>, p_limit <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-9">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._t_warmup <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> t_warmup</span>
<span id="cb1-10">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._p_limit <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> p_limit</span>
<span id="cb1-11">        </span>
<span id="cb1-12">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._reset()</span>
<span id="cb1-13">        </span>
<span id="cb1-14">        </span>
<span id="cb1-15"></span>
<span id="cb1-16">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> predict_next(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y: torch.tensor) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> Tuple[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span>]:</span>
<span id="cb1-17">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._update_data(y)</span>
<span id="cb1-18"></span>
<span id="cb1-19">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._t_warmup:</span>
<span id="cb1-20">            <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._init_params()</span>
<span id="cb1-21">        </span>
<span id="cb1-22">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&gt;=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._t_warmup:</span>
<span id="cb1-23">            prob, is_changepoint <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._check_for_changepoint()</span>
<span id="cb1-24">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> is_changepoint:</span>
<span id="cb1-25">                <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._reset()</span>
<span id="cb1-26"></span>
<span id="cb1-27">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>prob), is_changepoint</span>
<span id="cb1-28">        </span>
<span id="cb1-29">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span>:</span>
<span id="cb1-30">            <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">False</span></span>
<span id="cb1-31">            </span>
<span id="cb1-32">    </span>
<span id="cb1-33">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _reset(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-34">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-35">                </span>
<span id="cb1-36">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_obs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb1-37">        </span>
<span id="cb1-38">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span></span>
<span id="cb1-39">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span></span>
<span id="cb1-40">            </span>
<span id="cb1-41">    </span>
<span id="cb1-42">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _update_data(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y: torch.tensor) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-43">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_t <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb1-44">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_obs.append(y.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb1-45"></span>
<span id="cb1-46">        </span>
<span id="cb1-47">    </span>
<span id="cb1-48">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _init_params(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">None</span>:</span>
<span id="cb1-49">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.mean(torch.concat(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_obs))</span>
<span id="cb1-50">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.std(torch.concat(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_obs))</span>
<span id="cb1-51">             </span>
<span id="cb1-52">    </span>
<span id="cb1-53">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _check_for_changepoint(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> Tuple[<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">bool</span>]:</span>
<span id="cb1-54">        standardized_sum <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">sum</span>(torch.concat(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_obs) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_mean)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.current_t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb1-55">        prob <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">float</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._get_prob(standardized_sum).detach().numpy())</span>
<span id="cb1-56">        </span>
<span id="cb1-57">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> prob, prob <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>._p_limit</span>
<span id="cb1-58">    </span>
<span id="cb1-59">    </span>
<span id="cb1-60">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> _get_prob(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, y: torch.tensor) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="bu" style="color: null;
background-color: null;
bool</span>:</span>">
font-style: inherit;">torch.Tensor</span>:</span>
<span id="cb1-61">        p <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.distributions.normal.Normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).cdf(torch.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">abs</span>(y))</span>
<span id="cb1-62">        prob <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> p)</span>
<span id="cb1-63">        </span>
<span id="cb1-64">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> prob</span></code></pre></div>
</div>
</section>
<section id="probabilistic-cusum-in-practice" class="level2">
<h2 class="anchored" data-anchor-id="probabilistic-cusum-in-practice">Probabilistic CUSUM in practice</h2>
<p>Let us try the above algorithm on two examples. First, we use the simulated dataset with piecewise constant mean from the introduction:</p>
<div id="cell-9" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb2-2"></span>
<span id="cb2-3">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">456</span>)</span>
<span id="cb2-4">torch.manual_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">456</span>)</span>
<span id="cb2-5"></span>
<span id="cb2-6">segment_lengths <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.random.randint(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">30</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>)]</span>
<span id="cb2-7"></span>
<span id="cb2-8"></span>
<span id="cb2-9">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.concat([torch.normal(torch.zeros(seg_len)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.random.uniform(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>),np.random.uniform()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> seg_len <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> segment_lengths])</span>
<span id="cb2-10"></span>
<span id="cb2-11"></span>
<span id="cb2-12">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CusumMeanDetector()</span>
<span id="cb2-13">outs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [test.predict_next(y[i]) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y))]</span>
<span id="cb2-14"></span>
<span id="cb2-15"></span>
<span id="cb2-16"></span>
<span id="cb2-17">cps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.where(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: x[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], outs)))[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb2-18">probs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: x[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], outs)))</span>
<span id="cb2-19"></span>
<span id="cb2-20">X, Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.meshgrid(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y)),np.linspace(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">11</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">11</span>))</span>
<span id="cb2-21">Z <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> probs[X]</span>
<span id="cb2-22"></span>
<span id="cb2-23"></span>
<span id="cb2-24">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">9</span>))</span>
<span id="cb2-25">plt.contourf(X,Y,Z,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Reds"</span>)</span>
<span id="cb2-26">plt.plot(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y)),y.detach().numpy(),lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Data"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb2-27"></span>
<span id="cb2-28">plt.axvline(np.cumsum(segment_lengths)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Actual changepoints"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-29">[plt.axvline(cp, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> cp <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> np.cumsum(segment_lengths)[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]]</span>
<span id="cb2-30"></span>
<span id="cb2-31">plt.axvline(cps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Detected changepoints"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-32">[plt.axvline(cp, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> cp <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> cps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]]</span>
<span id="cb2-33"></span>
<span id="cb2-34">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-35"></span>
<span id="cb2-36">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/probabilistic-cusum-for-change-point-detection_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Our modified version of CUSUM detected all change points, albeit with some delay. Notably, every change point fell in a region where the probability metric had already flagged unusual behavior, so with some fine-tuning, critical change points might have been detected even earlier.</p>
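<p>The fine-tuning remark can be made concrete with a back-of-envelope calculation (a hypothetical helper, ignoring estimation noise in the mean and standard deviation): after a mean shift of <code>delta</code> standard deviations at time <code>t0</code>, the statistic drifts roughly like <code>delta * k / sqrt(t0 + k)</code> after <code>k</code> post-shift samples, so we can solve for the first <code>k</code> that crosses the threshold implied by <code>p_limit</code>:</p>

```python
from math import sqrt
from statistics import NormalDist

def expected_delay(delta, t0, p_limit):
    """Smallest k with delta * k / sqrt(t0 + k) above the two-sided
    normal quantile for p_limit -- a rough estimate of the number of
    post-shift samples needed before the p-value crosses the threshold."""
    z = NormalDist().inv_cdf(1.0 - p_limit / 2.0)
    k = 1
    while delta * k / sqrt(t0 + k) < z:
        k += 1
    return k

# detection delay for a one-sigma shift after 100 in-control samples,
# at increasingly strict p-value thresholds
for p in (0.05, 0.01, 0.001):
    print(p, expected_delay(delta=1.0, t0=100, p_limit=p))
```

<p>Stricter thresholds buy fewer false alarms at the price of later detection, which is exactly the trade-off behind tuning <code>p_limit</code>.</p>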
<p>For our second example, let us use an excerpt from the <a href="https://www.kaggle.com/datasets/yuriykatser/skoltech-anomaly-benchmark-skab?resource=download&amp;ref=sarem-seitz.com">Skoltech Anomaly Benchmark</a> dataset on Kaggle. I chose this time series with the assumptions behind CUSUM in mind (in particular, the constant-mean assumption between change points). Hence, the result should not be read as a reliable benchmark but rather as an illustrative example:</p>
<div id="cell-11" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb3-2"></span>
<span id="cb3-3">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.read_csv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/SKAB/other/11.csv"</span>,sep<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">";"</span>)</span>
<span id="cb3-4">df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"datetime"</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.to_datetime(df[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"datetime"</span>])</span>
<span id="cb3-5">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.sort_values(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"datetime"</span>)</span>
<span id="cb3-6"></span>
<span id="cb3-7">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> torch.tensor(df.iloc[:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>].values)</span>
<span id="cb3-8"></span>
<span id="cb3-9">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CusumMeanDetector()</span>
<span id="cb3-10">outs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [test.predict_next(y[i]) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y))]</span>
<span id="cb3-11"></span>
<span id="cb3-12"></span>
<span id="cb3-13"></span>
<span id="cb3-14">cps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.where(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: x[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>], outs)))[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb3-15">probs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.array(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: x[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], outs)))</span>
<span id="cb3-16"></span>
<span id="cb3-17">X, Y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.meshgrid(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y)),np.linspace(torch.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">min</span>(y).detach().numpy(),torch.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">max</span>(y).detach().numpy()))</span>
<span id="cb3-18">Z <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> probs[X]</span>
<span id="cb3-19"></span>
<span id="cb3-20"></span>
<span id="cb3-21">plt.figure(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">18</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">9</span>))</span>
<span id="cb3-22">plt.contourf(X,Y,Z,alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.3</span>,cmap<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Reds"</span>)</span>
<span id="cb3-23">plt.plot(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y)),y.detach().numpy(),lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Data"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"blue"</span>)</span>
<span id="cb3-24"></span>
<span id="cb3-25"></span>
<span id="cb3-26">plt.axvline(cps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>], color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Detected changepoints"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb3-27">[plt.axvline(cp, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dashed"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> cp <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> cps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:]]</span>
<span id="cb3-28"></span>
<span id="cb3-29">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>, linestyle<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"dotted"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb3-30"></span>
<span id="cb3-31">plt.legend()</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/probabilistic-cusum-for-change-point-detection_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>While our CUSUM variant struggled with the linear trend patterns, the overall result looks reasonable. This also demonstrates the algorithm's limits once the constant-mean assumption is violated. Nevertheless, despite its simplicity, CUSUM appears to be a useful choice.</p>
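<p>As a minimal, self-contained sketch (not necessarily the exact variant used above; <code>k</code> and <code>h</code> are illustrative tuning constants), the standard two-sided CUSUM recursion on a noise-free step-change signal looks like this:</p>

```python
import numpy as np

def cusum_changepoints(x, k=0.5, h=5.0):
    """Standard two-sided CUSUM.

    Assumes x is standardized against the in-control mean;
    k is the slack parameter, h the decision threshold.
    """
    s_pos = s_neg = 0.0
    cps = []
    for t, xt in enumerate(x):
        s_pos = max(0.0, s_pos + xt - k)   # accumulates upward shifts
        s_neg = max(0.0, s_neg - xt - k)   # accumulates downward shifts
        if s_pos > h or s_neg > h:
            cps.append(t)
            s_pos = s_neg = 0.0            # restart after an alarm
    return cps

# noise-free step change at t=100 keeps the example deterministic
y = np.concatenate([np.zeros(100), np.full(100, 3.0)])
print(cusum_changepoints(y)[:3])  # → [102, 105, 108]
```

<p>A linear trend violates the constant-mean assumption: the cumulative sums drift across the threshold even without a genuine change point, which matches the behaviour seen in the plot above.</p>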
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>Although CUSUM is a very simple algorithm, it can be quite powerful as long as the underlying assumptions are met. With a simple, probabilistic modification we can easily improve the standard version of CUSUM and make it more expressive and intuitive.</p>
<p>For more complex problems though, more sophisticated algorithms are likely necessary. One particularly useful algorithm is <a href="https://gregorygundersen.com/blog/2019/08/13/bocd/?ref=sarem-seitz.com">Bayesian Online Changepoint Detection</a>, which I hope to cover in the future.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <category>Change Point Detection</category>
  <guid>https://www.sarem-seitz.com/posts/probabilistic-cusum-for-change-point-detection.html</guid>
  <pubDate>Thu, 04 Aug 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Multivariate, probabilistic time-series forecasting with LSTM and Gaussian Copula</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/multivariate-probabilistic-time-series-forecasting-with-lstm-and-gaussian-copula.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>As commonly known, LSTMs (<a href="https://de.wikipedia.org/wiki/Long_short-term_memory?ref=sarem-seitz.com">Long short-term memory networks</a>) are great for dealing with sequential data. One such example is multivariate time-series data. Here, LSTMs can model conditional distributions for complex forecasting problems.</p>
<p>For example, consider the following conditional forecasting distribution: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap%5Cleft(y_%7Bt+1%7D%20%5Cmid%20y_t%5Cright)=%5Cmathcal%7BN%7D%5Cleft(y_%7Bt+1%7D%20%5Cmid%20%5Cmu_%5Ctheta%5Cleft(y_t%5Cright),%20%5CSigma_%5Ctheta%5Cleft(y_t%5Cright)%5Cright)%20%5C%5C%0A%5Cmu_%5Ctheta%5Cleft(y_t%5Cright)=f%5Cleft(y_t,%20h_t%5Cright)_%5Cmu%20%5C%5C%0A%5CSigma_%5Ctheta%5Cleft(y_t%5Cright)=L_%5Ctheta%5Cleft(y_t%5Cright)%20L_%5Ctheta%5Cleft(y_t%5Cright)%5E%5Ctop%20%5C%5C%0AL_%5Ctheta%5Cleft(y_t%5Cright)=f%5Cleft(y_t,%20h_t%5Cright)_L%0A%5Cend%7Bgathered%7D%0A"></p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?f%5Cleft(y_t,%20h_t%5Cright)_%5Cmu:="> LSTM mean output given hidden state <img src="https://latex.codecogs.com/png.latex?h_t"></li>
<li><img src="https://latex.codecogs.com/png.latex?f%5Cleft(y_t,%20h_t%5Cright)_L:="> LSTM covariance Cholesky output given hidden state <img src="https://latex.codecogs.com/png.latex?h_t"></li>
</ul>
<p>Notice that we predict the Cholesky factor of the conditional covariance matrix, which guarantees that the resulting covariance matrix is positive semi-definite. This approach would allow us to model quite complex dynamical problems.</p>
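<p>A minimal NumPy sketch of this parameterization (the vector <code>raw</code> stands in for hypothetical unconstrained LSTM outputs): filling a lower-triangular factor and multiplying it with its transpose always yields a positive semi-definite covariance matrix.</p>

```python
import numpy as np

D = 3
n_params = (D**2 + D) // 2  # = 6 outputs needed for the Cholesky factor alone

# hypothetical unconstrained network outputs
raw = np.array([0.3, -0.5, 0.8, 0.1, -0.2, 0.6])

L = np.zeros((D, D))
L[np.tril_indices(D)] = raw
# softplus keeps the diagonal strictly positive, making L a valid Cholesky factor
L[np.diag_indices(D)] = np.log1p(np.exp(np.diag(L)))

Sigma = L @ L.T  # positive semi-definite by construction
print(np.all(np.linalg.eigvalsh(Sigma) > 0))  # → True
```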
<p>However, the degrees of freedom in this model rapidly explode with increasing dimensionality D of the multivariate time-series. After all, we need (D^2+D)/2 LSTM outputs for the covariance structure alone, which can easily lead to overfitting.</p>
<p>Another disadvantage is the assumption of a conditionally Gaussian time-series. As soon as our time-series is not a vector of real numbers, this model no longer applies.</p>
<p>Thus, a potential solution should satisfy two properties:</p>
<ol type="1">
<li>Allow us to <strong>parsimoniously</strong> handle high-dimensional time-series</li>
<li>Work with conditionally <strong>non-Gaussian</strong> time-series</li>
</ol>
</section>
<section id="lstms-with-gaussian-copula" class="level2">
<h2 class="anchored" data-anchor-id="lstms-with-gaussian-copula">LSTMs with Gaussian Copula</h2>
<p>As a potential solution, we could separate the dependency among the time-series from their marginal distributions. Hence, let us presume a constant conditional dependency between the time-series but varying conditional marginals. This suggests that a Copula model might be a good approach - for simplicity, we use a Gaussian Copula.</p>
<p>Since the basics of the Gaussian Copula have been discussed in <a href="https://sarem-seitz.com/blog/arma-forecasting-for-non-gaussian-time-series-data-using-copulas/?ref=sarem-seitz.com">this previous article</a>, we won’t repeat them here.</p>
<p>In summary, our model looks as follows: <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(y_%7Bt+1%7D%20%5Cmid%20y_t%5Cright)=%5Cprod_%7Bd=1%7D%5ED%20p_d%5Cleft(y_%7Bt+1%7D%5E%7B(d)%7D%20%5Cmid%20%5Ctheta_d%5Cleft(y_t%5Cright)%5Cright)%20%5Ccdot%20c%5Cleft(F%5Cleft(y_%7Bt+1%7D%5E%7B(1)%7D%20%5Cmid%20%5Ctheta_1%5Cleft(y_t%5Cright)%5Cright),%20%5Cldots,%20F%5Cleft(y_%7Bt+1%7D%5E%7B(D)%7D%20%5Cmid%20%5Ctheta_D%5Cleft(y_t%5Cright)%5Cright)%20;%20R%5Cright)%0A"></p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?p_d:=%5Cmathrm%7Bd%7D">-th marginal forecast density of <img src="https://latex.codecogs.com/png.latex?%5Cmathrm%7Bd%7D">-th time-series</li>
<li><img src="https://latex.codecogs.com/png.latex?y_%7Bt+1%7D%5E%7B(d)%7D"></li>
<li><img src="https://latex.codecogs.com/png.latex?%5Ctheta_d%5Cleft(y_t%5Cright)=f%5Cleft(y_t%20;%20h_t%5Cright)_%7B%5Ctheta_d%7D:=%5Cmathrm%7Bd%7D">-th conditional parameter vector modelled as the output of the LSTM</li>
<li><img src="https://latex.codecogs.com/png.latex?c(%5Ccdot,%20%5Cldots,%20%5Ccdot%20;%20R):="> Gaussian Copula density with dependency parameter matrix <img src="https://latex.codecogs.com/png.latex?R"></li>
<li><img src="https://latex.codecogs.com/png.latex?F%5Cleft(y_%7Bt+1%7D%5E%7B(d)%7D%20%5Cmid%20%5Ctheta_d%5Cleft(y_t%5Cright)%5Cright):="> d-th marginal forecast c.d.f.</li>
</ul>
<p>This allows us to deal with arbitrary continuous marginal distributions. In fact, we could even work with mixed continuous marginal distributions. In order to achieve sparsity in the copula parameter matrix, we could, for example, add a regularization term as is <a href="https://scikit-learn.org/stable/modules/covariance.html?ref=sarem-seitz.com#shrunk-covariance">typically done</a> when estimating high-dimensional covariance matrices.</p>
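<p>For concreteness, here is a minimal sketch of the Gaussian Copula density term on its own, using SciPy rather than the LSTM setup (<code>R</code> is an assumed correlation matrix and <code>u</code> holds marginal CDF values):</p>

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

def gaussian_copula_logdensity(u, R):
    """log c(u_1, ..., u_D; R) for marginal CDF values u in (0, 1)."""
    z = norm.ppf(u)  # map CDF values to standard-normal quantiles
    joint = multivariate_normal(mean=np.zeros(len(u)), cov=R)
    # copula density = joint Gaussian density / product of standard-normal marginals
    return joint.logpdf(z) - norm.logpdf(z).sum()

R = np.array([[1.0, 0.6], [0.6, 1.0]])
print(gaussian_copula_logdensity(np.array([0.7, 0.8]), R))
```

<p>With <code>R</code> equal to the identity matrix the log-density is exactly zero, i.e.&nbsp;the copula reduces to independence.</p>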
<p>The only drawback now is the assumption of a constant dependency over time. If this contradicts the data at hand, we might need to model the copula parameter in an auto-regressive manner as well. In that case, a low-rank matrix approach could preserve some parsimony.</p>
<p>To show how this could be implemented in the case of Gaussian marginals, I have created a quick Jupyter notebook with TensorFlow. Regarding the Copula part, the <a href="https://www.tensorflow.org/probability/examples/Gaussian_Copula?ref=sarem-seitz.com">TensorFlow example on Gaussian Copulas</a> has a ready-made implementation using <a href="https://www.tensorflow.org/probability/api_docs/python/tfp/bijectors/Bijector?ref=sarem-seitz.com">TensorFlow Probability bijectors</a>.</p>
</section>
<section id="implementation-sketch-not-too-much-explanations-from-here-on" class="level2">
<h2 class="anchored" data-anchor-id="implementation-sketch-not-too-much-explanations-from-here-on">Implementation (sketch, not too much explanations from here on)</h2>
<p>Data taken from https://www.kaggle.com/datasets/vagifa/usa-commodity-prices. We will use only the culinary oil prices, presuming that there is some underlying correlation among them.</p>
<div id="cell-5" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> pandas <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> pd</span>
<span id="cb1-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> matplotlib.pyplot <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> plt</span>
<span id="cb1-3"></span>
<span id="cb1-4">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.read_csv(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/commodity-prices-2016.csv"</span>)</span>
<span id="cb1-5">df <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df.set_index(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Date"</span>)</span>
<span id="cb1-6">df.index <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.to_datetime(df.index)</span>
<span id="cb1-7"></span>
<span id="cb1-8">oils <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> df[[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Olive Oil"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Palm oil"</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Soybean Oil"</span>]]</span>
<span id="cb1-9"></span>
<span id="cb1-10">oils.plot(figsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">8</span>))</span>
<span id="cb1-11">plt.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/multivariate-probabilistic-time-series-forecasting-with-lstm-and-gaussian-copula_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<section id="use-log-difference-prices-for-standardization" class="level4">
<h4 class="anchored" data-anchor-id="use-log-difference-prices-for-standardization">Use log-difference prices for standardization</h4>
<p>(i.e.&nbsp;‘log-returns’)</p>
<div id="cell-7" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> numpy <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> np</span>
<span id="cb2-2"></span>
<span id="cb2-3">oils_ld <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> np.log(oils).diff().iloc[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:,:]</span>
<span id="cb2-4"></span>
<span id="cb2-5">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>))</span>
<span id="cb2-6">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>].plot(oils_ld[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Olive Oil"</span>],label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Olive Oil"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C0"</span>)</span>
<span id="cb2-7">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>].plot(oils_ld[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Palm oil"</span>], label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Palm oil"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C1"</span>)</span>
<span id="cb2-8">ax[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>].plot(oils_ld[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Soybean Oil"</span>], label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Soybean Oil"</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C2"</span>)</span>
<span id="cb2-9">[a.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> a <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> ax]</span>
<span id="cb2-10">[a.legend() <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> a <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> ax]</span></code></pre></div>
<div class="cell-output cell-output-display">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/multivariate-probabilistic-time-series-forecasting-with-lstm-and-gaussian-copula_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="create-lag-1-features-as-input" class="level4">
<h4 class="anchored" data-anchor-id="create-lag-1-features-as-input">Create lag-1 features as input</h4>
<p>We might want to go to higher lags for increased accuracy.</p>
<div id="cell-9" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1">oils_lagged <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> pd.concat([oils_ld.shift(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),oils_ld],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).iloc[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:,:]</span>
<span id="cb3-2">oils_lagged.columns <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [c<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"_l1"</span> <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> c <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> oils_ld.columns] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(oils_ld.columns)</span>
<span id="cb3-3">oils_lagged</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>/var/folders/2d/hl2cr85d2pb2kfbmsng3267c0000gn/T/ipykernel_61976/3710246824.py:1: FutureWarning: In a future version of pandas all arguments of concat except for the argument 'objs' will be keyword-only.
  oils_lagged = pd.concat([oils_ld.shift(1),oils_ld],1).iloc[1:,:]</code></pre>
</div>
<div class="cell-output cell-output-display" data-execution_count="3">
<div>


<table class="dataframe caption-top table table-sm table-striped small" data-quarto-postprocess="true" data-border="1">
<thead>
<tr class="header">
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th">Olive Oil_l1</th>
<th data-quarto-table-cell-role="th">Palm oil_l1</th>
<th data-quarto-table-cell-role="th">Soybean Oil_l1</th>
<th data-quarto-table-cell-role="th">Olive Oil</th>
<th data-quarto-table-cell-role="th">Palm oil</th>
<th data-quarto-table-cell-role="th">Soybean Oil</th>
</tr>
<tr class="odd">
<th data-quarto-table-cell-role="th">Date</th>
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th"></th>
<th data-quarto-table-cell-role="th"></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td data-quarto-table-cell-role="th">1980-03-01</td>
<td>-0.006731</td>
<td>0.014993</td>
<td>-0.013089</td>
<td>-0.030768</td>
<td>-0.069312</td>
<td>-0.063604</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">1980-04-01</td>
<td>-0.030768</td>
<td>-0.069312</td>
<td>-0.063604</td>
<td>-0.050111</td>
<td>-0.025850</td>
<td>-0.076200</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">1980-05-01</td>
<td>-0.050111</td>
<td>-0.025850</td>
<td>-0.076200</td>
<td>-0.017756</td>
<td>-0.045196</td>
<td>0.026527</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">1980-06-01</td>
<td>-0.017756</td>
<td>-0.045196</td>
<td>0.026527</td>
<td>0.004272</td>
<td>-0.050933</td>
<td>0.046044</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">1980-07-01</td>
<td>0.004272</td>
<td>-0.050933</td>
<td>0.046044</td>
<td>0.009627</td>
<td>-0.018182</td>
<td>0.189117</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">...</td>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
<td>...</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">2015-10-01</td>
<td>-0.036858</td>
<td>-0.002459</td>
<td>-0.063183</td>
<td>-0.092682</td>
<td>0.092317</td>
<td>0.055295</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">2015-11-01</td>
<td>-0.092682</td>
<td>0.092317</td>
<td>0.055295</td>
<td>-0.114045</td>
<td>-0.052427</td>
<td>-0.014648</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">2015-12-01</td>
<td>-0.114045</td>
<td>-0.052427</td>
<td>-0.014648</td>
<td>-0.096288</td>
<td>0.034072</td>
<td>0.096772</td>
</tr>
<tr class="even">
<td data-quarto-table-cell-role="th">2016-01-01</td>
<td>-0.096288</td>
<td>0.034072</td>
<td>0.096772</td>
<td>0.047791</td>
<td>0.020941</td>
<td>-0.025876</td>
</tr>
<tr class="odd">
<td data-quarto-table-cell-role="th">2016-02-01</td>
<td>0.047791</td>
<td>0.020941</td>
<td>-0.025876</td>
<td>0.033650</td>
<td>0.114146</td>
<td>0.040106</td>
</tr>
</tbody>
</table>

<p>432 rows × 6 columns</p>
</div>
</div>
</div>
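<p>The lag-1 construction above generalizes to p lags. A small hypothetical helper (<code>add_lags</code> is not part of the original notebook) sketches the idea:</p>

```python
import pandas as pd

def add_lags(df, p):
    """Append lags 1..p of every column and drop the first p rows."""
    lagged = [df.shift(k).add_suffix(f"_l{k}") for k in range(p, 0, -1)]
    return pd.concat(lagged + [df], axis=1).iloc[p:]

demo = pd.DataFrame({"x": [1.0, 2.0, 3.0, 4.0]})
print(add_lags(demo, 2))
```

<p>Passing <code>axis=1</code> explicitly also avoids the pandas FutureWarning raised by the positional call above.</p>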
<div id="cell-10" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">X_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> oils_lagged.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>,:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>].values</span>
<span id="cb5-2">y_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> oils_lagged.iloc[:<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>:].values</span>
<span id="cb5-3">X_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> oils_lagged.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:,:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>].values</span>
<span id="cb5-4">y_test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> oils_lagged.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>:,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>:].values</span></code></pre></div>
</div>
</section>
<section id="create-the-model-httpswww.tensorflow.orgprobabilityexamplesgaussian_copulatextgaussian-copula-to20illustrate20how" class="level4">
<h4 class="anchored" data-anchor-id="create-the-model-httpswww.tensorflow.orgprobabilityexamplesgaussian_copulatextgaussian-copula-to20illustrate20how">Create the model (https://www.tensorflow.org/probability/examples/Gaussian_Copula#:~:text=Gaussian-,Copula,-To%20illustrate%20how)</h4>
<div id="cell-12" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb6-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> tensorflow <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> tf</span>
<span id="cb6-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> tensorflow <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> keras</span>
<span id="cb6-3"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">from</span> tensorflow.keras <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> layers</span>
<span id="cb6-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">import</span> tensorflow_probability <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> tfp</span>
<span id="cb6-5">tfd <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfp.distributions</span>
<span id="cb6-6">tfb <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfp.bijectors</span>
<span id="cb6-7"></span>
<span id="cb6-8"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> GaussianCopulaTriL(tfd.TransformedDistribution):</span>
<span id="cb6-9">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, loc, scale_tril):</span>
<span id="cb6-10">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>(GaussianCopulaTriL, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>).<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(</span>
<span id="cb6-11">            distribution<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>tfd.MultivariateNormalTriL(</span>
<span id="cb6-12">                loc<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>loc,</span>
<span id="cb6-13">                scale_tril<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>scale_tril),</span>
<span id="cb6-14">            bijector<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>tfb.NormalCDF(),</span>
<span id="cb6-15">            validate_args<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,</span>
<span id="cb6-16">            name<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"GaussianCopulaTriLUniform"</span>)</span>
<span id="cb6-17"></span>
<span id="cb6-18">        </span>
<span id="cb6-19"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">class</span> CopulaLSTMModel(tf.keras.Model):</span>
<span id="cb6-20">    </span>
<span id="cb6-21">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, input_dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>, output_dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>):</span>
<span id="cb6-22">        <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">super</span>().<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">__init__</span>()</span>
<span id="cb6-23">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.input_dims <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> input_dims</span>
<span id="cb6-24">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> output_dims</span>
<span id="cb6-25">        </span>
<span id="cb6-26">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#use LSTM state to ease training and testing state transition</span></span>
<span id="cb6-27">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.c0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.ones([<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,input_dims]), trainable <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>) </span>
<span id="cb6-28">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.h0 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.ones([<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,input_dims]), trainable <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb6-29">        </span>
<span id="cb6-30">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.lstm <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> layers.LSTM(input_dims, </span>
<span id="cb6-31">                                         batch_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,input_dims),</span>
<span id="cb6-32">                                         return_sequences<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>,</span>
<span id="cb6-33">                                         return_state<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span></span>
<span id="cb6-34">                                        )</span>
<span id="cb6-35">        </span>
<span id="cb6-36">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mean_layer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> layers.Dense(output_dims)</span>
<span id="cb6-37">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.std_layer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> layers.Dense(output_dims,activation<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>tf.nn.softplus)</span>
<span id="cb6-38">        </span>
<span id="cb6-39">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.chol <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.Variable(tf.random.normal((output_dims,output_dims)), trainable <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">True</span>)</span>
<span id="cb6-40">    </span>
<span id="cb6-41">    </span>
<span id="cb6-42">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> call(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, inputs):</span>
<span id="cb6-43">        lstm_out <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.lstm(inputs, initial_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.c0, <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.h0])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>]</span>
<span id="cb6-44">        means <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mean_layer(lstm_out)</span>
<span id="cb6-45">        stds <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.std_layer(lstm_out)</span>
<span id="cb6-46">        </span>
<span id="cb6-47">        distributions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfd.Normal(means, stds)</span>
<span id="cb6-48">        </span>
<span id="cb6-49">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> distributions</span>
<span id="cb6-50">    </span>
<span id="cb6-51">    </span>
<span id="cb6-52">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> call_with_state(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, inputs, c_state, h_state):</span>
<span id="cb6-53">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#explicitly use and return the initial state - primarily for forecasting</span></span>
<span id="cb6-54">        lstm_out, c_out, h_out <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.lstm(inputs, initial_state <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [c_state, h_state])</span>
<span id="cb6-55">        </span>
<span id="cb6-56">        </span>
<span id="cb6-57">        means <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mean_layer(lstm_out)</span>
<span id="cb6-58">        stds <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.std_layer(lstm_out)</span>
<span id="cb6-59">        </span>
<span id="cb6-60">        distributions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfd.Normal(means, stds)</span>
<span id="cb6-61">        </span>
<span id="cb6-62">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> distributions, c_out, h_out</span>
<span id="cb6-63">        </span>
<span id="cb6-64">    </span>
<span id="cb6-65">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> get_normalized_covariance(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>):</span>
<span id="cb6-66">        unnormalized_covariance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.chol<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>tf.transpose(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.chol)</span>
<span id="cb6-67">        normalizer <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.eye(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>(tf.linalg.tensor_diag_part(unnormalized_covariance)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">**</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb6-68">        </span>
<span id="cb6-69">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> normalizer<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>unnormalized_covariance<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>normalizer</span>
<span id="cb6-70">        </span>
<span id="cb6-71">        </span>
<span id="cb6-72">    </span>
<span id="cb6-73">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> conditional_log_prob(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, inputs, targets):</span>
<span id="cb6-74">        marginals <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.call(inputs)</span>
<span id="cb6-75">        marginal_lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.reshape(marginals.log_prob(targets),(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims))</span>
<span id="cb6-76">        </span>
<span id="cb6-77">        copula_transformed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> marginals.cdf(targets) <span class="co" style="color: #5E5E5E; background-color: null; font-style: inherit;">#use the targets argument, not the global y_train</span></span>
<span id="cb6-78">        normalized_covariance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.get_normalized_covariance()  <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#need covariance matrix with unit diagonal for Gaussian Copula</span></span>
<span id="cb6-79">        copula_dist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> GaussianCopulaTriL(loc<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>tf.zeros(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims),scale_tril <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.linalg.cholesky(normalized_covariance))</span>
<span id="cb6-80">        </span>
<span id="cb6-81">        copula_lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> copula_dist.log_prob(copula_transformed)</span>
<span id="cb6-82">        </span>
<span id="cb6-83">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> tf.reduce_mean(tf.math.reduce_sum(marginal_lpdfs,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> copula_lpdfs)</span>
<span id="cb6-84">        </span>
<span id="cb6-85">    </span>
<span id="cb6-86">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> train_step(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, data):</span>
<span id="cb6-87">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#custom train_step is required for the custom log-likelihood loss</span></span>
<span id="cb6-88">        x, y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> data</span>
<span id="cb6-89">        </span>
<span id="cb6-90">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">with</span> tf.GradientTape() <span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">as</span> tape:</span>
<span id="cb6-91">            loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.conditional_log_prob(x,y)</span>
<span id="cb6-92">            </span>
<span id="cb6-93">        trainable_vars <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.trainable_weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.lstm.trainable_weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.mean_layer.trainable_weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.std_layer.trainable_weights</span>
<span id="cb6-94">        gradients <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tape.gradient(loss, trainable_vars)</span>
<span id="cb6-95">        </span>
<span id="cb6-96">        <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.optimizer.apply_gradients(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">zip</span>(gradients, trainable_vars))</span>
<span id="cb6-97">        </span>
<span id="cb6-98">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> {<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Current loss"</span>: loss}</span>
<span id="cb6-99">    </span>
<span id="cb6-100">    </span>
<span id="cb6-101">    </span>
<span id="cb6-102">    <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">def</span> sample_forecast(<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>, X_train, y_train, forecast_periods <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">12</span>):</span>
<span id="cb6-103">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#this is still quite slow; should be optimized if used for a real-world problem</span></span>
<span id="cb6-104">        </span>
<span id="cb6-105">        normalized_covariance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.get_normalized_covariance()</span>
<span id="cb6-106">        copula_dist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfp.distributions.MultivariateNormalTriL(scale_tril <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tf.linalg.cholesky(normalized_covariance))</span>
<span id="cb6-107">        copula_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> tfp.distributions.Normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>).cdf(copula_dist.sample(forecast_periods))</span>
<span id="cb6-108">        </span>
<span id="cb6-109">        sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb6-110">        </span>
<span id="cb6-111">        input_current <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> y_train</span>
<span id="cb6-112">        </span>
<span id="cb6-113">        <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#encode the training data once to initialize the state, then propagate it through the loop</span></span>
<span id="cb6-114">        _, c_current, h_current <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.lstm(X_train)</span>
<span id="cb6-115">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(forecast_periods):</span>
<span id="cb6-116"></span>
<span id="cb6-117">            new_dist, c_current, h_current <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.call_with_state(tf.reshape(input_current,(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.input_dims)), c_current, h_current)</span>
<span id="cb6-118"></span>
<span id="cb6-119">            input_current <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> new_dist.quantile(tf.reshape(copula_sample[t,:],(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims)))</span>
<span id="cb6-120">            sample.append(tf.reshape(input_current,(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims)).numpy().reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="va" style="color: #111111;
background-color: null;
font-style: inherit;">self</span>.output_dims))</span>
<span id="cb6-121">        </span>
<span id="cb6-122">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> np.concatenate(sample)</span></code></pre></div>
</div>
</section>
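<p>The covariance trick inside <code>get_normalized_covariance</code> is worth isolating: we learn an unconstrained square matrix, form <code>L @ L.T</code> to guarantee a valid covariance matrix, and rescale it to unit diagonal so it can act as the Gaussian copula's correlation matrix. Here is a minimal NumPy sketch of the same idea (the matrix values are arbitrary):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# unconstrained square matrix, analogous to the model's self.chol variable
L = rng.normal(size=(3, 3))

# L @ L.T is symmetric and positive semi-definite, hence a valid covariance
cov = L @ L.T

# rescale by the inverse square roots of the diagonal -> unit diagonal,
# i.e. a proper correlation matrix
normalizer = np.diag(1.0 / np.sqrt(np.diag(cov)))
corr = normalizer @ cov @ normalizer
```

<p>As long as <code>L</code> has full rank, the result is a valid correlation matrix, which is why this unconstrained parameterization plays nicely with plain gradient descent.</p>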
<section id="train-the-model" class="level4">
<h4 class="anchored" data-anchor-id="train-the-model">Train the model</h4>
<div id="cell-14" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb7-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb7-2">tf.random.set_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb7-3"></span>
<span id="cb7-4">test <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> CopulaLSTMModel()</span>
<span id="cb7-5"></span>
<span id="cb7-6">test.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">compile</span>(optimizer<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"adam"</span>)</span>
<span id="cb7-7"></span>
<span id="cb7-8">test.fit(X_train.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>),y_train.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>), epochs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>, verbose<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#relatively fast</span></span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="6">
<pre><code>&lt;keras.callbacks.History at 0x1469271c0&gt;</code></pre>
</div>
</div>
</section>
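<p>Before looking at the forecasts, it may help to see the copula mechanics from <code>sample_forecast</code> in isolation: draw from a multivariate normal with the learned correlation matrix, squash each coordinate through the standard normal CDF to obtain correlated uniforms, and then feed those uniforms to the marginal quantile functions. A stripped-down NumPy sketch of the first two steps (the correlation value is made up for illustration):</p>

```python
import math

import numpy as np

rng = np.random.default_rng(123)

# illustrative 2x2 correlation matrix (the model learns this via self.chol)
corr = np.array([[1.0, 0.8],
                 [0.8, 1.0]])

# step 1: correlated Gaussian draws via the Cholesky factor
z = rng.standard_normal((5000, 2)) @ np.linalg.cholesky(corr).T

# step 2: the standard normal CDF turns them into correlated Uniform(0,1) draws
std_normal_cdf = np.vectorize(lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0))))
u = std_normal_cdf(z)

# each column of u is marginally uniform, but the columns remain dependent
```

<p>In the model itself, the third step is handled by <code>new_dist.quantile(...)</code>, which maps the correlated uniforms back through each marginal forecast distribution.</p>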
<section id="raw-log-diff-predictions" class="level4">
<h4 class="anchored" data-anchor-id="raw-log-diff-predictions">Raw log-diff predictions</h4>
<div id="cell-16" class="cell">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb9-1">np.random.seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb9-2">tf.random.set_seed(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb9-3"></span>
<span id="cb9-4">samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [test.sample_forecast(X_train.reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>),y_train[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,:].reshape(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>)] <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#very slow, grab a coffee or two</span></span>
<span id="cb9-5"></span>
<span id="cb9-6">samples_restructured <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.concatenate(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">list</span>(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">lambda</span> x: x[:,i].reshape(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),samples)),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb9-7">means <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.mean(s,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples_restructured]</span>
<span id="cb9-8">lowers <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.quantile(s,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples_restructured]</span>
<span id="cb9-9">uppers <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.quantile(s,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples_restructured]</span>
<span id="cb9-10"></span>
<span id="cb9-11">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>))</span>
<span id="cb9-12">[ax[i].plot(y_test[:,i],label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>oils.columns[i],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(i)) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb9-13">[ax[i].plot(means[i], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Mean forecast"</span>, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb9-14">[ax[i].fill_between(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(y_test)),lowers[i],uppers[i], label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Forecast interval"</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb9-15"></span>
<span id="cb9-16">[a.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> a <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> ax]</span></code></pre></div>
</div>
</section>
<section id="actual-price-predictions" class="level4">
<h4 class="anchored" data-anchor-id="actual-price-predictions">Actual price predictions</h4>
<div id="cell-18" class="cell">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1">samples_retrans <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.exp(np.log(oils.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,i<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>np.cumsum(samples_restructured[i],<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>)) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb10-2">means_retrans <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.mean(s,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples_retrans]</span>
<span id="cb10-3">lowers_retrans <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.quantile(s,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples_retrans]</span>
<span id="cb10-4">uppers_retrans <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [np.quantile(s,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.975</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> samples_retrans]</span>
<span id="cb10-5"></span>
<span id="cb10-6">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>))</span>
<span id="cb10-7">[ax[i].plot(oils.values[:,i],label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>oils.columns[i],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(i)) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb10-8">[ax[i].set_xlim((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils))) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb10-9">[ax[i].plot(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils)),np.concatenate([[oils.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,i]],means_retrans[i]]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Mean forecast"</span>, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb10-10">[ax[i].fill_between(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils)),np.concatenate([[oils.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,i]],lowers_retrans[i]]),np.concatenate([[oils.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,i]],uppers_retrans[i]]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Forecast interval"</span>, color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.2</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb10-11"></span>
<span id="cb10-12">[a.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> a <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> ax]</span></code></pre></div>
</div>
</section>
<section id="sample-trajectories" class="level4">
<h4 class="anchored" data-anchor-id="sample-trajectories">Sample trajectories</h4>
<div id="cell-20" class="cell">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb11-1">fig, ax <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plt.subplots(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>, figsize <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">16</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>))</span>
<span id="cb11-2"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> s <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>):</span>
<span id="cb11-3">    [ax[i].plot(np.arange(<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils)),np.concatenate([[oils.iloc[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">13</span>,i<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>]],samples_retrans[i][:,s]]), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Mean forecast"</span>, color <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"purple"</span>, lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb11-4">[ax[i].plot(oils.values[:,i],label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>oils.columns[i],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"C</span><span class="sc" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">{}</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"</span>.<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">format</span>(i), lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb11-5">[ax[i].set_xlim((<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">375</span>,<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">len</span>(oils))) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">range</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb11-6"></span>
<span id="cb11-7">[a.grid(alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> a <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> ax]</span></code></pre></div>
</div>
</section>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>This was just a rough collection of ideas on what a Copula-LSTM time-series model could look like. Feel free to contact me for more information.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Hochreiter, Sepp; Schmidhuber, Jürgen. Long short-term memory. Neural computation, 1997, 9.8, p.&nbsp;1735-1780.</p>
<p><strong>[2]</strong> Nelsen, Roger B. An introduction to copulas. Springer Science &amp; Business Media, 2007.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <category>Neural Networks</category>
  <guid>https://www.sarem-seitz.com/posts/multivariate-probabilistic-time-series-forecasting-with-lstm-and-gaussian-copula.html</guid>
  <pubDate>Thu, 30 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Let’s make GARCH more flexible with Normalizing Flows</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>For financial time-series data, GARCH (<a href="https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity?ref=sarem-seitz.com#GARCH(p,_q)_model_specification:~:text=the%20null%20hypothesis.-,GARCH,-%5Bedit%5D">Generalized AutoRegressive Conditional Heteroscedasticity</a>) models play an important role. While forecasting mean returns is usually futile, stock volatility appears to be predictable, at least to some extent. However, standard GARCH relies on the potentially limiting assumption of conditional Gaussian data.</p>
<p><a href="https://sarem-seitz.com/blog/arma-forecasting-for-non-gaussian-time-series-data-using-copulas/?ref=sarem-seitz.com">Just like last time</a>, we could use a Copula approach to remove such Gaussianity assumptions. Given that stock time-series typically have a lot of observations, a more flexible approach might be superior. In fact, it would be great if our model could simply infer the conditional distribution from the data provided.</p>
<p>A popular Machine Learning approach to such problems is the family of <a href="https://arxiv.org/abs/1908.09257?ref=sarem-seitz.com">Normalizing Flows</a>. In summary, Normalizing Flows allow us to transform a known base distribution into a complex one in a differentiable manner. Let us briefly look at the technicalities:</p>
</section>
<section id="a-short-intro-to-normalizing-flows" class="level2">
<h2 class="anchored" data-anchor-id="a-short-intro-to-normalizing-flows">A short intro to Normalizing Flows</h2>
<p>When choosing a probability model, we typically face a trade-off between <strong>flexibility and tractability</strong>. While the Gaussian distribution is nice to work with, real-world data is usually far more complex.</p>
<p>On the other extreme, we have modern generative models like <a href="https://en.wikipedia.org/wiki/Generative_adversarial_network?ref=sarem-seitz.com">GANs</a> that can produce highly complex data at the cost of an intractable distribution. This requires sampling-based estimators for parameter estimation, which can be quite inefficient.</p>
<p>Somewhere in the middle, we have Normalizing Flows. On the one hand, the expressiveness of Normalizing Flows might still be too limited for advanced image generation. On the other hand, they are likely more than sufficient as a replacement for a mere Normal distribution.</p>
<section id="the-core-principle-behind-normalizing-flows" class="level3">
<h3 class="anchored" data-anchor-id="the-core-principle-behind-normalizing-flows">The core principle behind Normalizing Flows</h3>
<p>At the heart of Normalizing Flows lies the change-of-variables formula, which tells us how the density of a random variable changes under monotone transformations. As we are only interested in transformations of univariate variables, we will focus on the univariate variant:</p>
<blockquote class="blockquote">
<p>Let <img src="https://latex.codecogs.com/png.latex?p_X(%5Ccdot)"> denote the probability density of a (univariate) random variable <img src="https://latex.codecogs.com/png.latex?X">. Let <img src="https://latex.codecogs.com/png.latex?g:%20%5Cmathbb%7BR%7D%20%5Cmapsto%20%5Cmathbb%7BR%7D"> denote a strictly monotonic transformation. For the probability density <img src="https://latex.codecogs.com/png.latex?p_Y(%5Ccdot)"> of <img src="https://latex.codecogs.com/png.latex?%5Cmathrm%7BY%7D=%5Cmathrm%7Bg%7D(%5Cmathrm%7BX%7D)"> we have : <img src="https://latex.codecogs.com/png.latex?%0Ap_Y(y)=p_X%5Cleft(g%5E%7B-1%7D(y)%5Cright)%5Cleft%7C%5Cfrac%7Bd%7D%7Bd%20y%7D%20g%5E%7B-1%7D(y)%5Cright%7C%0A"></p>
</blockquote>
<p>In summary, this formula allows us to generate new random variables from known ones. At the same time, we can calculate the probability density of the derived random variable in closed form. Of course, the limiting factor here is the restriction of <img src="https://latex.codecogs.com/png.latex?g"> to strictly monotonic functions. As we will see, however, this still leaves plenty of room for reasonably flexible distributions.</p>
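<p>As a quick numerical sanity check of this formula (not part of the original post), consider the strictly monotone map g(x) = exp(x) applied to a standard Gaussian: the change-of-variables density must coincide with the well-known log-normal density.</p>

```python
import numpy as np
from scipy import stats

# X ~ N(0, 1), Y = g(X) = exp(X), hence g^{-1}(y) = log(y)
# Change of variables: p_Y(y) = p_X(log(y)) * |d/dy log(y)| = p_X(log(y)) / y
y = np.linspace(0.1, 5.0, 200)
p_y_formula = stats.norm.pdf(np.log(y)) / y

# Reference: log-normal density with sigma = 1
p_y_reference = stats.lognorm.pdf(y, s=1.0)

print(np.max(np.abs(p_y_formula - p_y_reference)))  # ~0 up to float precision
```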
<p>In Normalizing Flows, we now make the following crucial observation:</p>
<blockquote class="blockquote">
<p><em>Chaining strictly monotonic functions results in another, more complex strictly monotonic function.</em></p>
</blockquote>
<p>Put into an equation, this looks as follows:</p>
<blockquote class="blockquote">
<p>Let <img src="https://latex.codecogs.com/png.latex?g_1,%20%5Cldots,%20g_M"> be strictly monotonic. It then follows that <img src="https://latex.codecogs.com/png.latex?%0A%5Ctilde%7Bg%7D=g_M%20%5Ccirc%20%5Ccdots%20%5Ccirc%20g_1%0A"> is also strictly monotonic.</p>
</blockquote>
<p>If we define the outcome variable after the m-th transformation as <img src="https://latex.codecogs.com/png.latex?%0AX_m=g_m%20%5Ccirc%20%5Ccdots%20%5Ccirc%20g_1%5Cleft(X_0%5Cright)%0A"> we can derive the (log-) density for the resulting variable after the M-th transformation by applying the chain rule: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap_%7BX_M%7D%5Cleft(x_m%5Cright)=p_%7BX_0%7D%5Cleft(%5Ctilde%7Bg%7D%5E%7B-1%7D%5Cleft(x_M%5Cright)%5Cright)%5Cleft%7C%5Cfrac%7Bd%7D%7Bd%20x_M%7D%20%5Ctilde%7Bg%7D%5E%7B-1%7D%5Cleft(x_M%5Cright)%5Cright%7C%20%5C%5C%0A=p_%7BX_0%7D%5Cleft(x_0%5Cright)%5Cleft%7C%5Cprod_%7Bm=1%7D%5EM%5Cleft(g_m%5E%7B-1%7D%5Cright)%5E%7B%5Cprime%7D%5Cleft(x_m%5Cright)%5Cright%7C%20%5C%5C%0A=p_%7BX_0%7D%5Cleft(x_0%5Cright)%20%5Cprod_%7Bm=1%7D%5EM%5Cleft%7C%5Cleft(g_m%5Cright)%5E%7B%5Cprime%7D%5Cleft(x_%7Bm-1%7D%5Cright)%5Cright%7C%5E%7B-1%7D%20%5C%5C%0A%5CRightarrow%20%5C%5C%0A%5Clog%20p_%7BX_m%7D%5Cleft(x_m%5Cright)=%5Clog%20p_%7BX_0%7D%5Cleft(x_0%5Cright)+%5Csum%5EM%20%5Clog%20%5Cleft%7C%5Cleft(g_m%5Cright)%5E%7B%5Cprime%7D%5Cleft(x_%7Bm-1%7D%5Cright)%5Cright%7C%5E%7B-1%7D%0A%5Cend%7Bgathered%7D%0A"> which can then be used for maximum likelihood optimization. Notice that we used the <a href="https://en.wikipedia.org/wiki/Inverse_function_theorem?ref=sarem-seitz.com">inverse function theorem</a> to exchange the transformation with its inverse.</p>
<p>Now, everything boils down to a reasonable choice of the component-wise transformations.</p>
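<p>To make the chained version concrete, here is a small sketch (illustrative, not from the original post) that chains an affine map with the exponential. By the derivation above, the resulting log-density must agree with the log-normal density this composition implies.</p>

```python
import numpy as np
from scipy import stats

# Chain two strictly monotone maps: g1(x) = 2x + 1, then g2(x) = exp(x)
# log p_{X_2}(x2) = log p_{X_0}(x0) - sum_m log|g_m'(x_{m-1})|
def log_density(x2):
    x1 = np.log(x2)           # g2^{-1}
    x0 = (x1 - 1.0) / 2.0     # g1^{-1}
    log_p0 = stats.norm.logpdf(x0)
    log_det = np.log(2.0) + np.log(x2)  # |g1'| = 2, |g2'(x1)| = exp(x1) = x2
    return log_p0 - log_det

# X_2 = exp(2 * X_0 + 1) is log-normal with mu = 1, sigma = 2
y = np.linspace(0.05, 50.0, 500)
p_chain = np.exp(log_density(y))
p_reference = stats.lognorm.pdf(y, s=2.0, scale=np.exp(1.0))
print(np.max(np.abs(p_chain - p_reference)))  # ~0 up to float precision
```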
</section>
<section id="planar-normalizing-flows" class="level3">
<h3 class="anchored" data-anchor-id="planar-normalizing-flows">Planar Normalizing Flows</h3>
<p>A simple type of Normalizing Flow is the <strong>Planar Normalizing Flow</strong>, defined as <img src="https://latex.codecogs.com/png.latex?%0Ag_%5Ctheta(x)=x+a%20%5Ccdot%20h(w%20x+b)%20%5Cquad%20%5Ctheta=(a,%20w,%20b)%5ET%0A"> where <img src="https://latex.codecogs.com/png.latex?h"> denotes a smooth, non-linear function. The corresponding derivative is <img src="https://latex.codecogs.com/png.latex?%0Ag_%5Ctheta%5E%7B%5Cprime%7D=1+a%20w%20%5Ccdot%20h%5E%7B%5Cprime%7D(w%20x+b)%0A"> This looks quite similar to residual layers in <a href="https://en.wikipedia.org/wiki/Residual_neural_network?ref=sarem-seitz.com">Residual Neural Networks</a>. Thanks to the residual x in the left summand, the requirements on h are looser than the initial monotonicity assumption. A common choice for h is simply the <code>tanh</code> function.</p>
<p>Intuitively, the Planar Flow takes the original variable and adds a non-linear transformation. We can expect the result to resemble the input with some variation, depending on the magnitude of <img src="https://latex.codecogs.com/png.latex?a"> and <img src="https://latex.codecogs.com/png.latex?w">.</p>
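<p>Before turning to a library implementation, we can verify numerically that a Planar Flow yields a valid density. The sketch below (illustrative values for a, w, b; not from the original post) inverts g by root-finding, since the inverse has no closed form, and checks that the transformed density integrates to one.</p>

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad
from scipy.optimize import brentq

# Planar flow g(x) = x + a * tanh(w * x + b); strictly monotone for a * w > -1
a, w, b = 0.5, 1.0, 0.0

def g(x):
    return x + a * np.tanh(w * x + b)

def g_prime(x):
    # d/dx tanh(u) = w / cosh(u)^2
    return 1.0 + a * w / np.cosh(w * x + b) ** 2

def density(y):
    # invert g numerically, then apply the change-of-variables formula
    x = brentq(lambda x: g(x) - y, -50.0, 50.0)
    return stats.norm.pdf(x) / g_prime(x)

# mass outside (-10, 10) is negligible for a standard Gaussian base
total, _ = quad(density, -10.0, 10.0)
print(total)  # ≈ 1.0
```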
<p>To run our experiments, we can use the <a href="https://github.com/TuringLang/Bijectors.jl?ref=sarem-seitz.com">Bijectors.jl</a> package which conveniently contains a Planar Flow layer:</p>
<div id="cell-4" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb1-1">using Distributions, Plots, StatsPlots, Bijectors, Random</span>
<span id="cb1-2"></span>
<span id="cb1-3">Random.seed<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">321</span>)</span>
<span id="cb1-4">baseDist <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MvNormal(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),ones(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># standard Gaussian as a vector-valued random variable; Bijectors.PlanarLayer expects vector-valued r.v.s</span></span>
<span id="cb1-5">planarLayer1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.</span>])</span>
<span id="cb1-6">planarLayer2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>],[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>])</span>
<span id="cb1-7">planarLayer3 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.</span>],[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.</span>])</span>
<span id="cb1-8"></span>
<span id="cb1-9">flowDist1 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> transformed(baseDist, planarLayer1)</span>
<span id="cb1-10">flowDist2 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> transformed(baseDist, planarLayer2)</span>
<span id="cb1-11">flowDist3 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> transformed(baseDist, planarLayer3)</span>
<span id="cb1-12"></span>
<span id="cb1-13">line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Matrix(transpose(collect(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>:<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>)[:,:]))</span>
<span id="cb1-14"></span>
<span id="cb1-15"></span>
<span id="cb1-16">base_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plot(line[:],pdf(baseDist,line)[:],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Standard Gaussian base distribution"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png)</span>
<span id="cb1-17">flow1_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plot(line[:],pdf(flowDist1,line)[:],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Planar Flow - low non-linearity"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png)</span>
<span id="cb1-18">flow2_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plot(line[:],pdf(flowDist2,line)[:],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Planar Flow - medium non-linearity"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png)</span>
<span id="cb1-19">flow3_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plot(line[:],pdf(flowDist3,line)[:],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Planar Flow - strong non-linearity"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png)</span>
<span id="cb1-20"></span>
<span id="cb1-21">flow_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> plot(flow1_plot, flow2_plot, flow3_plot,layout <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>))</span>
<span id="cb1-22"></span>
<span id="cb1-23">plot(base_plot, flow_plot, layout <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>))</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="2">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
</section>
</section>
<section id="non-gaussian-garch-via-planar-normalizing-flows" class="level2">
<h2 class="anchored" data-anchor-id="non-gaussian-garch-via-planar-normalizing-flows">Non-Gaussian GARCH via Planar Normalizing Flows</h2>
<p>By combining GARCH with Normalizing Flows, we aim for two goals:</p>
<ol type="1">
<li><strong>Remove the assumption of conditionally Gaussian</strong> realizations while, at the same time,</li>
<li><strong>Preserve the autoregressive volatility property</strong> that is inherent to GARCH models</li>
</ol>
<p>For this article, we will focus on a simple GARCH(1,1) model. In an applied setting, we would want to try out different GARCH models and select the best one(s).</p>
<p>Recall that, for GARCH(1,1) with zero mean, we have: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t=%5Cepsilon_t%20%5Csigma_t%20%5Cquad%20%5Cepsilon_t%20%5Csim%20%5Cmathcal%7BN%7D(0,1)%20%5C%5C%0A%5Csigma_t=%5Csqrt%7B%5Cgamma+%5Calpha%20%5Csigma_%7Bt-1%7D%5E2+%5Cbeta%20%5Cepsilon_%7Bt-1%7D%5E2%7D%20%5C%5C%0A%5Cgamma,%20%5Calpha,%20%5Cbeta%20%5Cgeq%200,%20%5Cquad%20%5Calpha+%5Cbeta%3C1%0A%5Cend%7Bgathered%7D%0A"> With the above restrictions, the GARCH model can be shown to be stationary. While the conditional distributions are all Gaussian, the unconditional ones are not.</p>
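<p>The recursion above can be sketched in a few lines. Below is a hedged NumPy sketch (the post's own code is Julia, and <code>simulate_garch11</code> is an illustrative name); as in the simulation code later on, the squared shock entering the recursion is the scaled realization <code>y[t-1]</code>:</p>

```python
import numpy as np

def simulate_garch11(gamma, alpha, beta, T=250, sigma0=1.0, rng=None):
    """Zero-mean Gaussian GARCH(1,1): y_t = eps_t * sigma_t with
    sigma_t^2 = gamma + alpha * sigma_{t-1}^2 + beta * y_{t-1}^2."""
    # stationarity / positivity restrictions from the text
    assert gamma >= 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1
    if rng is None:
        rng = np.random.default_rng(0)
    sigma = np.empty(T)
    y = np.empty(T)
    sigma[0] = sigma0
    y[0] = rng.standard_normal() * sigma0
    for t in range(1, T):
        sigma[t] = np.sqrt(gamma + alpha * sigma[t - 1] ** 2 + beta * y[t - 1] ** 2)
        y[t] = rng.standard_normal() * sigma[t]
    return y, sigma
```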
<p>Now, in order to combine this with Normalizing Flows, we apply the following transformation: <img src="https://latex.codecogs.com/png.latex?%0A%5Ctilde%7By%7D_t=g%5Cleft(y_t%5Cright)%0A"> where <img src="https://latex.codecogs.com/png.latex?g:%20%5Cmathbb%7BR%7D%20%5Cmapsto%20%5Cmathbb%7BR%7D"> is a Normalizing Flow.</p>
<p>This choice can be justified by the fact that a 1D Normalizing Flow is a monotone transformation, combined with the invariance property of quantiles:</p>
<blockquote>
<p>Let <img src="https://latex.codecogs.com/png.latex?Q%5Cleft(Y_t,%20r%5Cright)"> denote the <img src="https://latex.codecogs.com/png.latex?r">-quantile of the univariate random variable <img src="https://latex.codecogs.com/png.latex?Y_t">. In addition, let <img src="https://latex.codecogs.com/png.latex?g:%20%5Cmathbb%7BR%7D%20%5Cmapsto%20%5Cmathbb%7BR%7D"> denote a monotone transformation. Then <img src="https://latex.codecogs.com/png.latex?Q%5Cleft(g%5Cleft(Y_t%5Cright),%20r%5Cright)=g%5Cleft(Q%5Cleft(Y_t,%20r%5Cright)%5Cright)">.</p>
</blockquote>
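<p>The invariance property is easy to check numerically. A minimal Python sketch, using an arbitrary strictly increasing map <code>g</code> as a stand-in for a 1D flow:</p>

```python
import numpy as np

rng = np.random.default_rng(42)
y = rng.normal(0.0, 2.0, size=100_000)

# an arbitrary strictly increasing (hence monotone) transformation
g = lambda x: x + 2.0 * np.tanh(x)

# quantiles commute with monotone maps: Q(g(Y), r) == g(Q(Y, r))
for r in (0.05, 0.5, 0.95):
    lhs = np.quantile(g(y), r)
    rhs = g(np.quantile(y, r))
    assert abs(lhs - rhs) < 0.05
```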
<p>With this result, we can draw the following conclusions about the transformed GARCH process: <img src="https://latex.codecogs.com/png.latex?%5Ctilde%7By%7D_t"> has constant median, <img src="https://latex.codecogs.com/png.latex?Q%5Cleft(%5Ctilde%7By%7D_t,%200.5%5Cright)=g%5Cleft(Q%5Cleft(y_t,%200.5%5Cright)%5Cright)=g(0)"></p>
<p>If <img src="https://latex.codecogs.com/png.latex?%5Coperatorname%7BVar%7D%5Cleft(y_t%5Cright)%3E%5Coperatorname%7BVar%7D%5Cleft(y_s%5Cright)">, then <img src="https://latex.codecogs.com/png.latex?Q%5Cleft(%5Ctilde%7By%7D_t,%200.05%5Cright)%3CQ%5Cleft(%5Ctilde%7By%7D_s,%200.05%5Cright)"> and <img src="https://latex.codecogs.com/png.latex?Q%5Cleft(%5Ctilde%7By%7D_t,%200.95%5Cright)%3EQ%5Cleft(%5Ctilde%7By%7D_s,%200.95%5Cright)"></p>
<p><img src="https://latex.codecogs.com/png.latex?%5CRightarrow"> This follows via <img src="https://latex.codecogs.com/png.latex?Q%5Cleft(y_t,%20r%5Cright)=%5Csigma_t%20%5CPhi%5E%7B-1%7D(r)"> (where <img src="https://latex.codecogs.com/png.latex?%5CPhi%5E%7B-1%7D(r)"> is the <img src="https://latex.codecogs.com/png.latex?r">-th quantile of a standard Normal) and therefore,</p>
<p><img src="https://latex.codecogs.com/png.latex?%0AQ%5Cleft(%5Ctilde%7By%7D_t,%200.05%5Cright)=g%5Cleft(%5Csigma_t%20%5CPhi%5E%7B-1%7D(0.05)%5Cright)%3Cg%5Cleft(%5Csigma_s%20%5CPhi%5E%7B-1%7D(0.05)%5Cright)=Q%5Cleft(%5Ctilde%7By%7D_s,%200.05%5Cright)%0A"></p>
<p>and similarly for <img src="https://latex.codecogs.com/png.latex?r=0.95">.</p>
<p>Thus, the risk of extreme events for the transformed process moves in conjunction with the underlying GARCH process. We could probably derive results for the variance of the Planar Flow GARCH as well. However, that would risk turning this post into a full-fledged research paper, so we shall content ourselves with the above.</p>
<p>Either way, we now have reassurance that our process will react to random shocks in much the same way as a plain GARCH process.</p>
<section id="simulating-a-planar-flow-garch11-process" class="level3">
<h3 class="anchored" data-anchor-id="simulating-a-planar-flow-garch11-process">Simulating a Planar Flow GARCH(1,1) process</h3>
<p>After clarifying the fundamentals of our model, we are ready for a quick simulation. For that, let us re-use the highly non-linear Planar Flow from before. The remaining model parameters are drawn from a standard Gaussian and mapped to their admissible domains inside the simulation and inference routines:</p>
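<p>The domain mapping can be sketched as follows: softplus for <code>gamma</code>, a sigmoid for <code>alpha</code>, and <code>beta</code> scaled by <code>(1 - alpha)</code> so that <code>alpha + beta &lt; 1</code> — the same constraints used in the simulation code. A NumPy sketch with illustrative names:</p>

```python
import numpy as np

def softplus(x):
    # smooth map from R to (0, inf)
    return np.log1p(np.exp(x))

def sigmoid(x):
    # smooth map from R to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def constrain(raw_gamma, raw_alpha, raw_beta):
    gamma = softplus(raw_gamma)              # gamma > 0
    alpha = sigmoid(raw_alpha)               # 0 < alpha < 1
    beta = sigmoid(raw_beta) * (1.0 - alpha) # ensures alpha + beta < 1
    return gamma, alpha, beta
```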
<div id="cell-6" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb2-1">using Flux, Random</span>
<span id="cb2-2"></span>
<span id="cb2-3">struct PF_GARCH <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#(=Planar-Flow-GARCH)</span></span>
<span id="cb2-4">    sigma0 <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#prior variance</span></span>
<span id="cb2-5">    gamma</span>
<span id="cb2-6">    alpha</span>
<span id="cb2-7">    beta</span>
<span id="cb2-8">    flow</span>
<span id="cb2-9">end</span>
<span id="cb2-10">Flux.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">@</span>functor PF_GARCH <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#for differentiability later on</span></span>
<span id="cb2-11"></span>
<span id="cb2-12"></span>
<span id="cb2-13">function simulate(m::PF_GARCH, T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span>)</span>
<span id="cb2-14">   </span>
<span id="cb2-15">    gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.gamma[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb2-16">    alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.alpha[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb2-17">    beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.beta[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>alpha) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#constrain alpha and beta to sum to &lt; 1</span></span>
<span id="cb2-18">    </span>
<span id="cb2-19">    sigeps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-20">    sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.sigma0[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb2-21">    </span>
<span id="cb2-22">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb2-23">        sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sqrt(gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-24">        sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> randn()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t]</span>
<span id="cb2-25">    end</span>
<span id="cb2-26">    </span>
<span id="cb2-27">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> m.flow(sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:end])[:]</span>
<span id="cb2-28">    </span>
<span id="cb2-29">end</span>
<span id="cb2-30"></span>
<span id="cb2-31">Random.seed<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb2-32">pf_garch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PF_GARCH(randn(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), randn(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), randn(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), randn(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.</span>],[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2.</span>]))</span>
<span id="cb2-33"></span>
<span id="cb2-34">pf_garch_draw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> simulate(pf_garch)</span>
<span id="cb2-35">gauss_garch_draw <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> inverse(pf_garch.flow)(Matrix(transpose(pf_garch_draw[:,:])))[:]</span>
<span id="cb2-36"></span>
<span id="cb2-37">plot(pf_garch_draw,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Planar Flow GARCH"</span>,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>),fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-38">plot<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(gauss_garch_draw,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Latent Gaussian GARCH"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="3">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>If we take a close look at the graph, we see that the Planar Flow GARCH produces values that cluster at either large positive or large negative levels. Scrolling back to the Planar Flow density plots shows that this indeed makes sense: the Gaussian-large-non-linear flow generated a bi-modal distribution with modes at around -3 and +3, which approximately matches the distribution of values in the Planar Flow GARCH chart.</p>
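<p>This can be checked numerically: pushing standard Gaussian samples through the 1D planar flow <code>y = x + u * tanh(w * x + b)</code> with the strongly non-linear parameters from above should leave very little mass near zero. A Python sketch (this assumes Bijectors.jl's <code>PlanarLayer(w, u, b)</code> argument ordering, and the exact cluster locations are approximate):</p>

```python
import numpy as np

rng = np.random.default_rng(123)
x = rng.standard_normal(100_000)

w, u, b = 5.0, 2.0, -2.0        # the "large non-linear" flow parameters
y = x + u * np.tanh(w * x + b)  # 1D planar flow; invertible since u * w > -1

# mass concentrates in two clusters away from zero
center_mass = np.mean(np.abs(y) < 0.5)
tail_mass = np.mean(np.abs(y) > 1.5)
assert center_mass < 0.15
assert tail_mass > 0.6
```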
</section>
<section id="planar-flow-garch-on-a-real-world-time-series" class="level3">
<h3 class="anchored" data-anchor-id="planar-flow-garch-on-a-real-world-time-series">Planar Flow GARCH on a real-world time series</h3>
<p>To validate our model, we can fit it on a stock return time-series and analyze the result. Let us use the Apple adjusted close price as our dataset. I downloaded the data from Yahoo Finance - you can replicate it via <a href="https://de.finance.yahoo.com/quote/AAPL/history?period1=1498262400&amp;period2=1656028800&amp;interval=1d&amp;filter=history&amp;frequency=1d&amp;includeAdjustedClose=true&amp;ref=sarem-seitz.com">this link</a>.</p>
<p>As our target time-series, we use <a href="https://quantivity.wordpress.com/2011/02/21/why-log-returns/?ref=sarem-seitz.com">log-returns</a>, standardized by subtracting their mean and dividing by their standard deviation. Given the range of values produced by the simulated Planar Flow GARCH, this standardization makes sense, as raw log-returns are typically on a much smaller scale. Afterwards, we can easily rescale our results back to actual log-returns.</p>
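<p>The preprocessing amounts to a few lines. A NumPy sketch with illustrative price data (the post itself does this in Julia via CSV.jl and DataFrames.jl):</p>

```python
import numpy as np

# illustrative adjusted close prices (stand-in for the AAPL series)
prices = np.array([150.0, 152.3, 151.1, 153.8, 155.2, 154.0, 156.7])

rets = np.diff(np.log(prices))    # log-returns
ym, ys = rets.mean(), rets.std()  # keep these to rescale results later
rets_std = (rets - ym) / ys       # standardized log-returns
```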
<div id="cell-8" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb3-1">using CSV, DataFrames, Flux, Zygote, Distributions</span>
<span id="cb3-2"></span>
<span id="cb3-3">adj_close <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (CSV.File(<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"../data/AAPL.csv"</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">|&gt;</span> DataFrame)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>,<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Adj Close"</span>]</span>
<span id="cb3-4"></span>
<span id="cb3-5">rets <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> diff(log.(adj_close))</span>
<span id="cb3-6"></span>
<span id="cb3-7">ym <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mean(rets)</span>
<span id="cb3-8">ys <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> std(rets)</span>
<span id="cb3-9">rets <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (rets.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>ym).<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>ys</span>
<span id="cb3-10"></span>
<span id="cb3-11">plot(rets, legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:none, title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"AAPL log-returns of adjusted close price (standardized)"</span>, size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>),fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="4">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Next, we define the model log-likelihood and use the ADAM optimizer for first-order optimization. By projecting the parameters via softplus and sigmoid, we can perform unconstrained optimization. Unfortunately, Julia's AutoDiff packages all errored out when computing the Hessian matrix, so the current implementation does not seem to permit second-order optimization.</p>
<p>For our implementation, we can mostly rely on Distributions.jl and Bijectors.jl. The latter makes the implementation of nested (deep) Planar Flows quite convenient.</p>
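<p>Under the hood, the likelihood rests on the change-of-variables formula <code>p(y) = N(g⁻¹(y); 0, σ²) / |g′(g⁻¹(y))|</code> for a monotone flow <code>g</code>. A minimal Python sanity check for a single planar layer, inverting <code>g</code> by bisection and verifying that the transformed density integrates to one (the <code>(w, u, b)</code> parameter ordering is an assumption, matching the initial values used below):</p>

```python
import numpy as np

w, u, b = 0.5, 1.0, 0.1  # initial values of one planar layer

g = lambda x: x + u * np.tanh(w * x + b)                  # strictly increasing since u * w > -1
g_prime = lambda x: 1.0 + u * w / np.cosh(w * x + b) ** 2  # derivative of the flow

def g_inv(y, lo=-50.0, hi=50.0):
    # bisection works because g is strictly increasing
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if g(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def logpdf_flow(y, sigma=1.0):
    # change of variables: x = g^-1(y); log p(y) = log N(x; 0, sigma^2) - log g'(x)
    x = g_inv(y)
    log_normal = -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * (x / sigma) ** 2
    return log_normal - np.log(g_prime(x))

# the resulting density should integrate to (approximately) one
grid = np.linspace(-10.0, 10.0, 4001)
pdf = np.array([np.exp(logpdf_flow(t)) for t in grid])
total = float(pdf.sum() * (grid[1] - grid[0]))
assert abs(total - 1.0) < 1e-3
```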
<div id="cell-10" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb4-1">using Zygote</span>
<span id="cb4-2"></span>
<span id="cb4-3">function Distributions.logpdf(m::PF_GARCH,y)</span>
<span id="cb4-4">    </span>
<span id="cb4-5">    T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> size(y,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-6">    </span>
<span id="cb4-7">    inverse_flow <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> inverse(m.flow)</span>
<span id="cb4-8">    ytilde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> inverse_flow(y) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get the underlying Gaussian GARCH </span></span>
<span id="cb4-9">    </span>
<span id="cb4-10">    sigeps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Zygote.Buffer(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#stores sigma_t (1st row) and epsilon_t (2nd row)</span></span>
<span id="cb4-11">    sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#set initial epsilon to zero</span></span>
<span id="cb4-12">    sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.sigma0[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb4-13">    </span>
<span id="cb4-14">    gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.gamma[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb4-15">    alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.alpha[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb4-16">    beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.beta[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>alpha) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#constrain alpha and beta to sum to &lt; 1</span></span>
<span id="cb4-17">    </span>
<span id="cb4-18">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb4-19">        sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sqrt(gamma[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#sigma_t</span></span>
<span id="cb4-20">        sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ytilde[t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>sigeps[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t] <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#epsilon_t</span></span>
<span id="cb4-21">    end</span>
<span id="cb4-22">    </span>
<span id="cb4-23">    <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">vars</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> copy(sigeps)</span>
<span id="cb4-24"></span>
<span id="cb4-25">    dists <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>MvNormal(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),x),<span class="bu" style="color: null;
background-color: null;
font-style: inherit;">vars</span>[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:end].<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-26">    flows <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.unsqueeze(transformed.(dists,m.flow),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb4-27">    </span>
<span id="cb4-28">    lpdfs <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Zygote.Buffer(zeros(T),T)</span>
<span id="cb4-29"></span>
<span id="cb4-30">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:T</span>
<span id="cb4-31">        lpdfs[t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> logpdf(flows[t],[y[t]]) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Bijectors.Composed flow expects vector valued variables</span></span>
<span id="cb4-32">    end</span>
<span id="cb4-33"></span>
<span id="cb4-34">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> mean(copy(lpdfs))</span>
<span id="cb4-35">end</span>
<span id="cb4-36"></span>
<span id="cb4-37"></span>
<span id="cb4-38">retsm <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Matrix(transpose(rets[:,:])) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Bijectors.jl flows treat 1xN matrices as N observations from a single-value vector valued r.v.</span></span>
<span id="cb4-39"></span>
<span id="cb4-40">pf_garch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> PF_GARCH(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>), PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>])∘PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>])∘PlanarLayer([<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1.</span>],[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span>]))</span>
<span id="cb4-41"></span>
<span id="cb4-42"></span>
<span id="cb4-43">params <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.params(pf_garch)</span>
<span id="cb4-44">opt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ADAM(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>)</span>
<span id="cb4-45"></span>
<span id="cb4-46"></span>
<span id="cb4-47"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span> <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#first-order optimization takes quite some time</span></span>
<span id="cb4-48">    grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Zygote.gradient(()<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;-</span>logpdf(pf_garch,retsm),params)</span>
<span id="cb4-49">    Flux.Optimise.update<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(opt,params,grads)</span>
<span id="cb4-50">end<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">;</span></span></code></pre></div>
</div>
</section>
<section id="the-planar-flow-garch-model-after-optimization" class="level3">
<h3 class="anchored" data-anchor-id="the-planar-flow-garch-model-after-optimization">The Planar Flow GARCH model after optimization</h3>
<p>To check the outcome, let us look at the results from several perspectives. First, we consider in-sample and out-of-sample (= forecast) point predictions and predictive intervals. For the point predictions, we use the median, which we can calculate analytically by applying the Normalizing Flow to the Gaussian median: since the flow is a monotone transformation, it maps quantiles of the base distribution onto quantiles of the transformed variable. In the same way, we obtain the 5% and 95% quantiles of the transformed variable from the respective Gaussian quantiles.</p>
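<p>This quantile transport idea can be sketched in a few lines; the transform <code>f</code> and the volatility <code>sigma_t</code> below are illustrative stand-ins for the fitted flow and the GARCH state, not the model from this post:</p>
<pre class="sourceCode julia"><code>using Random, Statistics

# Illustrative stand-in for the fitted flow: the post composes PlanarLayers from
# Bijectors.jl; any strictly increasing map preserves quantiles in the same way.
f(x) = x + 0.5 * tanh(x)

sigma_t = 1.3                          # some conditional GARCH volatility
z05, z50, z95 = -1.6449, 0.0, 1.6449   # standard Gaussian 5%/50%/95% quantiles

# Quantiles of y = f(sigma_t * eps) are the flow applied to the Gaussian quantiles
q05, q50, q95 = f(sigma_t * z05), f(sigma_t * z50), f(sigma_t * z95)

# Monte Carlo sanity check against the analytical median
Random.seed!(1)
y = f.(sigma_t .* randn(200_000))
@assert 0.02 > abs(quantile(y, 0.5) - q50)</code></pre>
<p>Because <code>f</code> is strictly increasing, the 5%/50%/95% quantiles of the base Gaussian map exactly onto the corresponding quantiles of the transformed variable; no sampling is needed for these point and interval summaries.</p>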
<p>Additionally, notice that we need to integrate out the unrealized noise terms in the forecast distributions, i.e.: <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(%5Ctilde%7By%7D_t%5Cright)=%5Cint%20p%5Cleft(%5Ctilde%7By%7D_t%20%5Cmid%20%5Cepsilon_t,%20%5Cepsilon_%7Bt-1%7D%5Cright)%20p%5Cleft(%5Cepsilon_t%5Cright)%20p%5Cleft(%5Cepsilon_%7Bt-1%7D%5Cright)%20d%20%5Cepsilon_t%20d%20%5Cepsilon_%7Bt-1%7D%0A"> whenever <img src="https://latex.codecogs.com/png.latex?t-1"> lies inside the forecast horizon.</p>
<p>As this would be tedious, we will use our model to sample from the forecast distribution and integrate the noise out implicitly.</p>
<div id="cell-12" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb5-1">function get_insample_distributions(m::PF_GARCH, y)</span>
<span id="cb5-2">    </span>
<span id="cb5-3">    T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> size(y,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-4">    </span>
<span id="cb5-5">    inverse_flow <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> inverse(m.flow)</span>
<span id="cb5-6">    ytilde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> inverse_flow(y)</span>
<span id="cb5-7">    </span>
<span id="cb5-8">    gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.gamma[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb5-9">    alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.alpha[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb5-10">    beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.beta[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>alpha) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#constrain alpha and beta to sum to &lt; 1</span></span>
<span id="cb5-11">    </span>
<span id="cb5-12">    sigeps_insample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#stores sigma_t (1st row) and epsilon_t (2nd row)</span></span>
<span id="cb5-13">    sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.sigma0[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb5-14">    </span>
<span id="cb5-15">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb5-16">        sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sqrt(gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-17">        sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> ytilde[t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>(sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>)</span>
<span id="cb5-18">    end</span>
<span id="cb5-19">    </span>
<span id="cb5-20">    dists_insample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>MvNormal(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),x),sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:end].<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-21">    flows_insample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.unsqueeze(transformed.(dists_insample,m.flow),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb5-22">    </span>
<span id="cb5-23">    dists_insample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>MvNormal(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),x),sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:end].<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-24">    flows_insample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.unsqueeze(transformed.(dists_insample,m.flow),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb5-25">    </span>
<span id="cb5-26">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> flows_insample, sigeps_insample</span>
<span id="cb5-27">end</span>
<span id="cb5-28"></span>
<span id="cb5-29"></span>
<span id="cb5-30"></span>
<span id="cb5-31">function sample_forecast(m::PF_GARCH, sigeps_insample, forecast_periods<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">60</span>)</span>
<span id="cb5-32">    </span>
<span id="cb5-33">    gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> softplus(m.gamma[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb5-34">    alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.alpha[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb5-35">    beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> σ(m.beta[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>alpha) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#constrain alpha and beta to sum to &lt; 1</span></span>
<span id="cb5-36">    </span>
<span id="cb5-37">    sigeps_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,forecast_periods<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#stores sigma_t (1st row) and epsilon_t (2nd row)</span></span>
<span id="cb5-38">    sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,end]</span>
<span id="cb5-39">    sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigeps_insample[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,end]</span>
<span id="cb5-40">    </span>
<span id="cb5-41">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>:forecast_periods<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb5-42">        sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sqrt(gamma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> alpha <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> beta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-43">        sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,t] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> randn()</span>
<span id="cb5-44">    end</span>
<span id="cb5-45">    </span>
<span id="cb5-46">    dists_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">map</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>MvNormal(zeros(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),x),sigeps_forecast[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,:].<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb5-47">    flows_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.unsqueeze(transformed.(dists_forecast,m.flow),<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb5-48">    </span>
<span id="cb5-49">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> vcat(rand.(flows_forecast)[:]...)</span>
<span id="cb5-50">    </span>
<span id="cb5-51">end</span>
<span id="cb5-52"></span>
<span id="cb5-53"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#--------------------------</span></span>
<span id="cb5-54"></span>
<span id="cb5-55">pf_garch_insample, sigeps_insample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> get_insample_distributions(pf_garch,retsm)</span>
<span id="cb5-56"></span>
<span id="cb5-57">pf_garch_5perc_quantile <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [r.transform([quantile(Normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,sqrt(r.dist.Σ[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>)])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> r <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> pf_garch_insample][:] .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym</span>
<span id="cb5-58">pf_garch_95perc_quantile <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [r.transform([quantile(Normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,sqrt(r.dist.Σ[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>)])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> r <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> pf_garch_insample][:] .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym</span>
<span id="cb5-59">pf_garch_median <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [r.transform([quantile(Normal(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>,sqrt(r.dist.Σ[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)])[<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> r <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> pf_garch_insample][:].<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym</span>
<span id="cb5-60"></span>
<span id="cb5-61"></span>
<span id="cb5-62">forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> hcat([sample_forecast(pf_garch, sigeps_insample) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">75000</span>]...) .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym</span>
<span id="cb5-63"></span>
<span id="cb5-64">forecast_5perc_quantile <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mapslices(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>quantile(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>),forecast,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[:]</span>
<span id="cb5-65">forecast_95perc_quantile <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mapslices(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>quantile(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>),forecast,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[:]</span>
<span id="cb5-66">forecast_median <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> mapslices(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span>quantile(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>),forecast,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[:]</span>
<span id="cb5-67"></span>
<span id="cb5-68"></span>
<span id="cb5-69">plot(pf_garch_median,</span>
<span id="cb5-70">     ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (pf_garch_median.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>pf_garch_5perc_quantile,pf_garch_95perc_quantile.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>pf_garch_median),</span>
<span id="cb5-71">     lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>:png, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Insample point and interval predictions"</span>)</span>
<span id="cb5-72">plot<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(collect(length(rets):<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">60</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>length(rets)),forecast_median,ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (forecast_median.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>forecast_5perc_quantile,forecast_95perc_quantile.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>forecast_median),</span>
<span id="cb5-73">      lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"green"</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"60 days ahead forecast"</span>)</span>
<span id="cb5-74">plot<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(rets.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym, alpha<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"red"</span>,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Realized returns"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="9">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Both in-sample and out-of-sample predictive intervals and point predictions look reasonable. If we preferred the mean over the median as a point estimate, we could estimate it from Monte Carlo samples, as the mean of the transformed variable is generally not available in closed form.</p>
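<p>A minimal sketch of such a Monte Carlo mean estimate, again with an illustrative stand-in for the fitted flow rather than the model above:</p>
<pre class="sourceCode julia"><code>using Random, Statistics

# The mean of a flow-transformed Gaussian generally has no closed form, but a
# Monte Carlo estimate is cheap; f and sigma_t are illustrative stand-ins.
f(x) = x + 0.5 * tanh(x)
sigma_t = 1.3

Random.seed!(42)
mc_mean = mean(f.(sigma_t .* randn(100_000)))   # here close to 0, since f is odd</code></pre>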
<p>In a real-world scenario, we would want to take a closer look at the forecast predictive interval. It appears somewhat too narrow and might underestimate the actual risk. Apart from that, however, the model seems to correctly capture the typical GARCH volatility clusters.</p>
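<p>One simple diagnostic for overly narrow intervals is empirical coverage: the fraction of realized returns inside the 90% interval should be close to 0.9. The sketch below uses placeholder data and interval bounds, not the fitted model's output:</p>
<pre class="sourceCode julia"><code>using Random, Statistics

# Empirical coverage check: values well below the nominal 0.90 signal intervals
# that are too narrow, i.e. understated risk. All inputs here are placeholders.
Random.seed!(0)
rets_orig = randn(500)              # placeholder for de-standardized returns
q05 = fill(-1.6449, 500)            # placeholder lower interval bounds
q95 = fill(1.6449, 500)             # placeholder upper interval bounds

coverage = mean(q05 .≤ rets_orig .≤ q95)   # nominally around 0.90 when calibrated</code></pre>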
<p>Finally, let us check the conditional distributions one step after the largest and the smallest return amplitudes. This gives us a visual impression of how far our model deviates from the Gaussian conditional returns of a standard GARCH model.</p>
<div id="cell-14" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb6-1">plot(collect(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>:<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>).<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym,[pdf(pf_garch_insample[argmin(rets.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],[x]) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> x <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> collect(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>:<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)],</span>
<span id="cb6-2">     size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional distribution at t+1 after smallest return amplitude"</span>, legendfontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>)</span>
<span id="cb6-3"></span>
<span id="cb6-4">plot<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">!</span>(collect(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>:<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>).<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ys .<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> ym,[pdf(pf_garch_insample[argmax(rets.<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],[x]) <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> x <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> collect(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>:<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>:<span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)],</span>
<span id="cb6-5">     label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Conditional distribution at t+1 after highest return amplitude"</span>, legendfontsize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="dv" style="color: #AD0000;
background-color: null;
font-style: inherit;">7</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="10">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows_files/figure-html/cell-7-output-1.svg" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Indeed, our model is highly non-Gaussian. Also, as expected, the conditional return distribution clearly spreads out after large return amplitudes, i.e.&nbsp;volatility increases after shocks.</p>
</section>
</section>
<section id="garch-and-normalizing-flows---conclusion" class="level2">
<h2 class="anchored" data-anchor-id="garch-and-normalizing-flows---conclusion">GARCH and Normalizing Flows - Conclusion</h2>
<p>In this article, we took a timeless model from quantitative finance and combined it with a popular machine learning approach. The outcome is quite interesting insofar as it can infer the conditional return distribution from data. This is in contrast to classic statistical models, where the user typically fixes the distribution ex-ante. As long as the time-series is long enough, this might yield better predictive results. After all, Gaussian and related distributional assumptions often oversimplify the real world.</p>
<p>To improve the model further from here, we might want to consider more complex GARCH dynamics. The GARCH(1,1) specification was chosen mainly for convenience; higher-order GARCH models would likely be better suited.</p>
<p>Besides that, we could use a more sophisticated version of GARCH altogether. You might want to take a look at <a href="https://en.wikipedia.org/wiki/Autoregressive_conditional_heteroskedasticity?ref=sarem-seitz.com">Wikipedia</a> for a non-exhaustive list of advanced versions of GARCH. Finally, we could also replace the Planar Flows with more advanced alternatives. As an example, consider <a href="https://arxiv.org/pdf/1803.05649.pdf?ref=sarem-seitz.com">Sylvester Flows</a> which generalize Planar Flows.</p>
<p>As always though, we should not fool ourselves by choosing the complex, ML-ish approach just because it is trendy. Rather, all candidate models should be carefully evaluated against each other. Nevertheless, it would be interesting to check whether this approach could be useful for an actual real-world trading strategy.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Bollerslev, Tim. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 1986, 31(3), p.&nbsp;307-327.</p>
<p><strong>[2]</strong> Rezende, Danilo; Mohamed, Shakir. Variational inference with normalizing flows. In: International Conference on Machine Learning. PMLR, 2015, p.&nbsp;1530-1538.</p>
<p><strong>[3]</strong> Kobyzev, Ivan; Prince, Simon J.D.; Brubaker, Marcus A. Normalizing flows: An introduction and review of current methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 43(11), p.&nbsp;3964-3979.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/lets-make-garch-more-flexible-with-normalizing-flows.html</guid>
  <pubDate>Mon, 27 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>ARMA forecasting for non-Gaussian time-series data using Copulas</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>ARMA (<a href="https://en.wikipedia.org/wiki/Autoregressive%E2%80%93moving-average_model?ref=sarem-seitz.com">AutoRegressive – Moving Average</a>) models are arguably the most popular approach to time-series forecasting. Unfortunately, plain ARMA is made for Gaussian distributed data only. While you can often still use ARMA by transforming the raw data, this typically makes probabilistic forecasts quite tedious.</p>
<p>One approach to applying ARMA to non-Normal data is Copula models. Roughly, the latter allow us to exchange the Gaussian marginal for any other continuous distribution. At the same time, they preserve the implicit time-dependency between observations that is imposed by ARMA.</p>
<p>If this sounds confusing, I suggest reading the next paragraph carefully. You might also want to consult some external sources for a deeper understanding.</p>
</section>
<section id="what-are-copulas-and-how-can-we-use-them-with-arma" class="level2">
<h2 class="anchored" data-anchor-id="what-are-copulas-and-how-can-we-use-them-with-arma">What are Copulas and how can we use them with ARMA?</h2>
<p>Informally, Copulas (or Copulae if you are a Latin hardliner) define joint cumulative distribution functions (c.d.f.) for unit-uniform random variables. Formally, we can describe this as <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AC%5Cleft(u_1,%20%5Cldots,%20u_m%5Cright)=P%5Cleft(U_1%20%5Cleq%20u_1,%20%5Cldots,%20U_m%20%5Cleq%20u_m%5Cright)%20%5C%5C%0AU_1,%20%5Cldots,%20U_m%20%5Csim%20%5Cmathcal%7BU%7D(0,1)%0A%5Cend%7Bgathered%7D%0A"> where each <img src="https://latex.codecogs.com/png.latex?U_i"> is marginally uniform; the dependency between them is exactly what the Copula encodes. That property alone is quite unspectacular, as uniform random variables are not very expressive for practical problems. However, an important result in probability theory will make things more interesting.</p>
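<p>As a quick numerical sketch of this definition, consider the simplest case, the independence copula <code>C(u, v) = u * v</code>, which arises when the uniform variables are independent (the point <code>(0.3, 0.7)</code> is an arbitrary choice for illustration):</p>

```julia
using Random, Statistics
Random.seed!(1)

# Independent U(0,1) draws have the independence copula as their joint c.d.f.
u = rand(100_000)
v = rand(100_000)

# Empirical joint c.d.f. at (0.3, 0.7); the independence copula
# predicts C(0.3, 0.7) = 0.3 * 0.7 = 0.21
emp_cdf = mean((u .<= 0.3) .& (v .<= 0.7))
```

<p>Non-trivial Copulas differ from this product form precisely because their uniform marginals are dependent.</p>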
<p>The <a href="https://en.wikipedia.org/wiki/Probability_integral_transform?ref=sarem-seitz.com">probability integral transform</a> states that we can transform any continuous random variable into a uniform one by plugging it into its own c.d.f.:</p>
<blockquote class="blockquote">
<p>Let <img src="https://latex.codecogs.com/png.latex?X"> be continuous with c.d.f. <img src="https://latex.codecogs.com/png.latex?F_X(x)">, then <img src="https://latex.codecogs.com/png.latex?F_X(X)%20%5Csim%20%5Cmathcal%7BU%7D(0,1)"></p>
</blockquote>
<p>We can verify this empirically for a standard Normal example:</p>
<div id="cell-4" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Distributions</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Plots</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">StatsPlots</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span></span>
<span id="cb1-2"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb1-3"></span>
<span id="cb1-4">sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)</span>
<span id="cb1-5">transformed_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(), sample)</span>
<span id="cb1-6"></span>
<span id="cb1-7">line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)</span>
<span id="cb1-8">line_transformed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb1-9"></span>
<span id="cb1-10">p_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">histogram</span>(sample,normalize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">true</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gaussian sample"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb1-11"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(p_sample, line, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),line),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Theoretical density"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb1-12"></span>
<span id="cb1-13">p_transformed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">histogram</span>(transformed_sample,normalize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">true</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>bottomright,title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Transformed sample"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb1-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(p_transformed, line_transformed, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Uniform</span>(),line_transformed),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Theoretical density"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb1-15"></span>
<span id="cb1-16"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(p_sample,p_transformed,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>),fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="1">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas_files/figure-html/cell-2-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>As the inverse of a c.d.f. is the quantile function, we can easily invert this transformation. Even cooler, we can transform a uniform random variable to any continuous random variable via <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AP%5Cleft(F_X%5E%7B-1%7D(U)%20%5Cleq%20x%5Cright)=F_X(x)%20%5C%5C%0AF_X%5E%7B-1%7D(%5Ccdot):=%5Ctext%20%7B%20inverse%20(quantile)%20function%20of%20%7D%20F_X(%5Ccdot)%0A%5Cend%7Bgathered%7D%0A"> This inverse transformation will become relevant later on.</p>
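<p>A short sketch of this inverse transformation (inverse-transform sampling), assuming a Gamma(2,1) target purely for illustration:</p>

```julia
using Distributions, Random, Statistics
Random.seed!(42)

# Uniform draws pushed through the Gamma(2,1) quantile function
# yield a Gamma(2,1) distributed sample
u = rand(100_000)
x = quantile.(Gamma(2, 1), u)
```

<p>The sample mean of <code>x</code> should land close to the theoretical mean of Gamma(2,1), which is 2.</p>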
<section id="combining-copulas-and-the-inverse-probability-transform" class="level3">
<h3 class="anchored" data-anchor-id="combining-copulas-and-the-inverse-probability-transform">Combining Copulas and the inverse probability transform</h3>
<p>In conjunction with Copulas, this allows us to separate the marginal distributions from the dependency structure of joint random variables.</p>
<p>A concrete example: Consider two random variables, <img src="https://latex.codecogs.com/png.latex?X"> and <img src="https://latex.codecogs.com/png.latex?Y"> with standard Gamma and Beta marginal distributions, i.e. <img src="https://latex.codecogs.com/png.latex?%0AX%20%5Csim%20%5CGamma(1,1),%20%5Cquad%20Y%20%5Csim%20%5Cmathcal%7BB%7D(1,1)%0A"> With the help of a Copula and the probability integral transform, we can now define a joint c.d.f over both variables such that we preserve their marginal distributions: <img src="https://latex.codecogs.com/png.latex?%0AP(X%20%5Cleq%20x,%20Y%20%5Cleq%20y)=C%5Cleft(F_X(x),%20F_Y(y)%5Cright)%0A"></p>
</section>
<section id="introducing-the-gaussian-copula" class="level3">
<h3 class="anchored" data-anchor-id="introducing-the-gaussian-copula">Introducing the Gaussian Copula</h3>
<p>So far, we haven’t specified any Copula function yet. A simplistic one is the Gaussian Copula, which is defined as follows: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AC_%7B%5Ctext%20%7BGauss%20%7D%7D%5Cleft(u_1,%20%5Cldots,%20u_m%20;%20R%5Cright)=%5CPhi_R%5Cleft(%5CPhi%5E%7B-1%7D%5Cleft(u_1%5Cright),%20%5Cldots,%20%5CPhi%5E%7B-1%7D%5Cleft(u_m%5Cright)%5Cright)%5C%5C%5C%5C%0A%5CPhi_R(%5Ccdot,%20%5Cldots,%20%5Ccdot):=%20%5Ctext%7BJoint%20c.d.f.%20of%20multivariate%20Gaussian%20s.t.%7D%5C%5C%0A%5Cmu=0,%20%5CSigma=R%20%5C%5C%0A%5Coperatorname%7Bdiag%7D(R)=1%5C%5C%5C%5C%0A%5CPhi%5E%7B-1%7D(%5Ccdot):=%20%5Ctext%7BQuantile%20function%20of%20standard%20Gaussian%7D%0A%5Cend%7Bgathered%7D%0A"> If we combine this with the Gamma-Beta example from before, we get the following Gaussian Copula joint c.d.f.: <img src="https://latex.codecogs.com/png.latex?%0AP(X%20%5Cleq%20x,%20Y%20%5Cleq%20y)=%5CPhi_R%5Cleft(%5CPhi%5E%7B-1%7D%5Cleft(F_X(x)%5Cright),%20%5CPhi%5E%7B-1%7D%5Cleft(F_Y(y)%5Cright)%5Cright)%0A"> The implicit rationale behind this approach can be described in three steps:</p>
<ol type="1">
<li>Transform the <strong>Gamma and Beta marginals into Uniform marginals</strong> via the respective <strong>c.d.f.s</strong></li>
<li>Transform the <strong>Uniform marginals into standard Normal marginals</strong> via the <strong>quantile functions</strong></li>
<li>Define the <strong>joint distribution via the multivariate Gaussian c.d.f.</strong> with zero mean, unit variance and non-zero covariances (correlation matrix R)</li>
</ol>
<p>By inverting these steps, we can easily sample from a bi-variate random variable that has the above properties. I.e. standard Gamma/Beta marginals with Gaussian Copula dependencies:</p>
<ol type="1">
<li>Draw a sample from a bi-variate Gaussian with mean zero, unit variance and non-zero covariances (correlation matrix R). You now have <strong>two correlated standard Gaussian variables</strong>.</li>
<li>Transform both variables with the standard Gaussian c.d.f. - you now have <strong>two correlated Uniform variables</strong>. (= probability integral transform)</li>
<li>Transform these variables with the standard Beta and Gamma quantile functions - you now have <strong>a pair of correlated Gamma-Beta variables</strong>. (= inverse probability integral transform)</li>
</ol>
<p>Notice that we could drop the zero-mean, unit-variance assumption on the multivariate Gaussian. In that case, we would have to adjust the Gaussian c.d.f. to the corresponding marginals in order to keep the probability integral transform valid.</p>
<p>Since we are only interested in the dependency structure (i.e.&nbsp;covariances), standard Gaussian marginals are sufficient and easier to deal with.</p>
<p>Now let us sample some data in Julia:</p>
<div id="cell-8" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb2-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Measures</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span></span>
<span id="cb2-2"></span>
<span id="cb2-3"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb2-4"></span>
<span id="cb2-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Step 1: Sample bi-variate Gaussian data with zero mean and unit variance</span></span>
<span id="cb2-6">mu <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb2-7">R <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>; <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb2-8">sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MvNormal</span>(mu,R),<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span>)</span>
<span id="cb2-9"></span>
<span id="cb2-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Step 2: Transform the data via the standard Gaussian c.d.f.</span></span>
<span id="cb2-11">sample_uniform <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(), sample)</span>
<span id="cb2-12"></span>
<span id="cb2-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Step 3: Transform the uniform marginals via the standard Gamma/Beta quantile functions</span></span>
<span id="cb2-14">sample_transformed <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> copy(sample_uniform)</span>
<span id="cb2-15">sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Gamma</span>(),sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>])</span>
<span id="cb2-16">sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Beta</span>(),sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>])</span>
<span id="cb2-17"></span>
<span id="cb2-18"></span>
<span id="cb2-19"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Plot the result</span></span>
<span id="cb2-20">scatterplot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter</span>(sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Joint sample"</span>,</span>
<span id="cb2-21">                      legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png,xlab<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gamma marginal"</span>, ylab<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Beta marginal"</span>)</span>
<span id="cb2-22"></span>
<span id="cb2-23">gamma_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10</span>)</span>
<span id="cb2-24">g_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">histogram</span>(sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],normalize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">true</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Gamma marginal"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb2-25"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(g_plot, gamma_line, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Gamma</span>(),gamma_line),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Theoretical density"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb2-26"></span>
<span id="cb2-27">beta_line <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb2-28">b_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">histogram</span>(sample_transformed[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],normalize<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="cn" style="color: #8f5902;
background-color: null;
font-style: inherit;">true</span>, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>bottomright,title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Beta marginal"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb2-29"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(b_plot, beta_line, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Beta</span>(),beta_line),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>,label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Theoretical density"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb2-30"></span>
<span id="cb2-31"></span>
<span id="cb2-32"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(scatterplot,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(g_plot,b_plot),layout<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>),fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png,margin<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">7.5</span>mm)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="5">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Congratulations, you have just sampled from your first Copula model!</p>
</section>
<section id="but-wait---i-want-to-fit-a-model" class="level3">
<h3 class="anchored" data-anchor-id="but-wait---i-want-to-fit-a-model">But wait - I want to fit a model!</h3>
<p>Let’s say we observed the above data without knowing the underlying generating process. We merely presume that Gamma and Beta marginals together with a Gaussian copula are a good choice. How could we fit the model parameters (i.e.&nbsp;‘learn’ them, in Machine Learning terms)?</p>
<p>As is often the case for statistical models, <a href="https://en.wikipedia.org/wiki/Maximum_likelihood_estimation?ref=sarem-seitz.com">Maximum Likelihood</a> is a good approach. However, we need a density function for that, so what do we do? We already found out that a Copula model describes a valid c.d.f. for continuous marginals. Thus, we can derive the corresponding probability density by taking derivatives: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap%5Cleft(x_1,%20%5Cldots,%20x_m%5Cright)%20%5C%5C%0A=%5Cfrac%7B%5Cpartial%5Em%7D%7B%5Cpartial%20x_1%20%5Ccdots%20%5Cpartial%20x_m%7D%20F%5Cleft(x_1,%20%5Cldots,%20x_m%5Cright)%20%5C%5C%0A=%5Cfrac%7B%5Cpartial%5Em%7D%7B%5Cpartial%20x_1%20%5Ccdots%20%5Cpartial%20x_m%7D%20C%5Cleft(F%5Cleft(x_1%5Cright),%20%5Cldots,%20F%5Cleft(x_m%5Cright)%5Cright)%20%5C%5C%0A=c%5Cleft(F%5Cleft(x_1%5Cright),%20%5Cldots,%20F%5Cleft(x_m%5Cright)%5Cright)%20%5Ccdot%20%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20x_1%7D%20F%5Cleft(x_1%5Cright)%20%5Ccdots%20%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20x_m%7D%20F%5Cleft(x_m%5Cright)%20%5C%5C%0A=c%5Cleft(F%5Cleft(x_1%5Cright),%20%5Cldots,%20F%5Cleft(x_m%5Cright)%5Cright)%20%5Ccdot%20p%5Cleft(x_1%5Cright)%20%5Ccdots%20p%5Cleft(x_m%5Cright)%0A%5Cend%7Bgathered%7D%0A"> (where <img src="https://latex.codecogs.com/png.latex?c(%5Ccdot,%20%5Cldots,%20%5Ccdot)"> is called a ‘Copula density function’ and <img src="https://latex.codecogs.com/png.latex?p(%5Ccdot)"> denotes a probability density function)</p>
<p>Now, for the Gaussian Copula, one can prove the following Copula density function: <img src="https://latex.codecogs.com/png.latex?%0Ac_%7B%5Ctext%20%7BGauss%20%7D%7D%5Cleft(u_1,%20%5Cldots,%20u_m%20;%20R%5Cright)=%5Cfrac%7B1%7D%7B%5Csqrt%7B%7CR%7C%7D%7D%20%5Cexp%20%5Cleft(-0.5%5Cleft(%5Cbegin%7Barray%7D%7Bc%7D%0A%5CPhi%5E%7B-1%7D%5Cleft(u_1%5Cright)%20%5C%5C%0A%5Cvdots%20%5C%5C%0A%5CPhi%5E%7B-1%7D%5Cleft(u_m%5Cright)%0A%5Cend%7Barray%7D%5Cright)%5ET%20%5Ccdot%5Cleft(R%5E%7B-1%7D-I%5Cright)%20%5Ccdot%5Cleft(%5Cbegin%7Barray%7D%7Bc%7D%0A%5CPhi%5E%7B-1%7D%5Cleft(u_1%5Cright)%20%5C%5C%0A%5Cvdots%20%5C%5C%0A%5CPhi%5E%7B-1%7D%5Cleft(u_m%5Cright)%0A%5Cend%7Barray%7D%5Cright)%5Cright)%0A"></p>
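<p>The post implements everything in Julia; purely as an illustration, the Copula density above is also straightforward to evaluate numerically. The following Python/NumPy sketch (not part of the original post; the function name is ours) computes the Gaussian Copula density directly from the formula:</p>

```python
import numpy as np
from scipy.stats import norm

def gaussian_copula_density(u, R):
    """Gaussian copula density c(u; R) for a vector u in (0, 1)^m.

    Evaluates |R|^(-1/2) * exp(-0.5 * z' (R^{-1} - I) z) with z = Phi^{-1}(u).
    """
    z = norm.ppf(np.asarray(u))  # probit-transform the uniforms
    R = np.asarray(R, dtype=float)
    quad = z @ (np.linalg.inv(R) - np.eye(len(z))) @ z
    return np.exp(-0.5 * quad) / np.sqrt(np.linalg.det(R))
```

<p>Sanity check: for R = I the quadratic form vanishes and the density is identically 1 (the independence copula); for a general R it equals the ratio of the multivariate normal density at z to the product of its standard-normal marginal densities, in line with the factorization derived above.</p>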
</section>
</section>
<section id="arma-with-non-normal-data-via-copulas" class="level2">
<h2 class="anchored" data-anchor-id="arma-with-non-normal-data-via-copulas">ARMA with non-normal data via Copulas</h2>
<p>Finally, we can return to our initial problem. For this example, we will focus on the stationary ARMA(1,1) model: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ay_t=%5Cphi%20y_%7Bt-1%7D+%5Ctheta%20%5Cepsilon_%7Bt-1%7D+%5Cepsilon_t%20%5C%5C%0A%5Cepsilon_t%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5Csigma%5E2%5Cright)%20%5C%5C%0A%7C%5Cphi%7C%3C1,%20%5Cquad%7C%5Ctheta%7C%3C1%0A%5Cend%7Bgathered%7D%0A"> For a time-series with <img src="https://latex.codecogs.com/png.latex?T"> observations, we can derive the unconditional, stationary distribution (see e.g.&nbsp;<a href="https://math.stackexchange.com/questions/1265466/the-autocovariance-function-of-arma1-1?ref=sarem-seitz.com">here</a>): <img src="https://latex.codecogs.com/png.latex?%0A%5Cleft(%5Cbegin%7Barray%7D%7Bc%7D%0Ay_1%20%5C%5C%0A%5Cvdots%20%5C%5C%0Ay_T%0A%5Cend%7Barray%7D%5Cright)%20%5Csim%20%5Cmathcal%7BN%7D%5Cleft(0,%20%5CSigma=%5Cleft%5B%5Cbegin%7Barray%7D%7Bcccc%7D%0A%5Cgamma(0)%20&amp;%20%5Cgamma(1)%20&amp;%20%5Ccdots%20&amp;%20%5Cgamma(T-1)%20%5C%5C%0A%5Cgamma(1)%20&amp;%20%5Cgamma(0)%20&amp;%20%5Ccdots%20&amp;%20%5Cgamma(T-2)%20%5C%5C%0A%5Cvdots%20&amp;%20%5Cvdots%20&amp;%20%5Cddots%20&amp;%20%5Cvdots%20%5C%5C%0A%5Cgamma(T-1)%20&amp;%20%5Cgamma(T-2)%20&amp;%20%5Ccdots%20&amp;%20%5Cgamma(0)%0A%5Cend%7Barray%7D%5Cright%5D%5Cright)%0A"> where <img src="https://latex.codecogs.com/png.latex?%5Cgamma(h)"> denotes the ARMA <img src="https://latex.codecogs.com/png.latex?(1,1)"> auto-covariance function for lag <img src="https://latex.codecogs.com/png.latex?h">: <img src="https://latex.codecogs.com/png.latex?%0A%5Cgamma(h)=%20%5Cbegin%7Bcases%7D%5Csigma%5E2%5Cleft(1+%5Cfrac%7B(%5Cphi+%5Ctheta)%5E2%7D%7B1-%5Cphi%5E2%7D%5Cright)%20&amp;%20h=0%20%5C%5C%20%5Csigma%5E2%5Cleft((%5Cphi+%5Ctheta)%20%5Cphi%5E%7Bh-1%7D+%5Cfrac%7B(%5Cphi+%5Ctheta)%5E2%20%5Cphi%5Eh%7D%7B1-%5Cphi%5E2%7D%5Cright)%20&amp;%20h%3E0%5Cend%7Bcases%7D%0A"> Informally, the unconditional distribution considers a fixed-length time-series as a single, multivariate random vector. As a consequence, it doesn’t matter whether we are sampling from the unconditional distribution or the usual ARMA equations (for an equally long time-series) themselves.</p>
<p>In some instances, such as this one, the unconditional distribution is easier to work with.</p>
<p>Also, notice that the unconditional marginal distributions (the distributions of the y_t’s) are the same regardless of the time-lag we are looking at. In fact, we have zero-mean Gaussians with variance equal to the auto-covariance function at zero.</p>
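<p>The Julia implementation of these quantities follows below; as an independent cross-check, here is a short Python sketch (illustrative only; the function names are ours) of the stationary ARMA(1,1) auto-covariance function and the resulting Toeplitz covariance matrix:</p>

```python
import numpy as np
from scipy.linalg import toeplitz

def arma11_autocov(phi, theta, sigma, h):
    """Stationary ARMA(1,1) autocovariance gamma(h)."""
    if h == 0:
        return sigma**2 * (1.0 + (phi + theta) ** 2 / (1.0 - phi**2))
    return sigma**2 * ((phi + theta) * phi ** (h - 1)
                       + (phi + theta) ** 2 * phi**h / (1.0 - phi**2))

def unconditional_cov(phi, theta, sigma, T):
    """Toeplitz covariance matrix of the stacked vector (y_1, ..., y_T)."""
    return toeplitz([arma11_autocov(phi, theta, sigma, h) for h in range(T)])
```

<p>Because the matrix is Toeplitz, its diagonal is constant at gamma(0), which is exactly why all unconditional marginals share one and the same distribution.</p>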
<p>Next, let us define: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0AG=%5Cfrac%7B1%7D%7B%5Cgamma(0)%7D%20%5Ccdot%20I%20%5C%5C%0A%5Ctilde%7B%5CSigma%7D=G%20%5CSigma%20%5C%5C%0A%5CRightarrow%20%5Coperatorname%7Bdiag%7D(%5Ctilde%7B%5CSigma%7D)=1%0A%5Cend%7Bgathered%7D%0A"> The transformed covariance matrix now implies unit variance while preserving the dependency structure of the unconditional time-series. Literally, we have just derived the <strong>correlation matrix</strong> but let us stick to the idea of a <strong>standardized covariance matrix</strong>.</p>
<p>If we plug this back into a Gaussian copula, we obtain what we could call an ARMA(1,1) Copula. Now, we could use the ARMA(1,1) Copula dependency structure together with any continuous marginal distribution. For example, we could define <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap%5Cleft(y_t%5Cright)=%5Coperatorname%7BExp%7D%5Cleft(y_t%20%5Cmid%200.5%5Cright)%20%5C%5C%0A=%5Cfrac%7B1%7D%7B0.5%7D%20e%5E%7B-%5Cfrac%7By_t%7D%7B0.5%7D%7D%20%5C%5C%0Ay_t%3E0%0A%5Cend%7Bgathered%7D%0A"> i.e.&nbsp;the unconditional marginals are Exponential-distributed with scale parameter 0.5 (the density above uses the scale parameterization, matching Julia’s Exponential(0.5)). Putting everything together, we obtain the following unconditional density: <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(%5Cbegin%7Barray%7D%7Bc%7D%0Ay_1%20%5C%5C%0A%5Cvdots%20%5C%5C%0Ay_T%0A%5Cend%7Barray%7D%5Cright)=%5Cprod_%7Bt=1%7D%5ET%20%5Coperatorname%7BExp%7D%5Cleft(y_t%20%5Cmid%200.5%5Cright)%20%5Ccdot%20c_%7B%5Ctext%20%7BGauss%20%7D%7D%5Cleft(F_%7B%5Coperatorname%7BExp%7D(0.5)%7D%5Cleft(y_1%5Cright),%20%5Cldots,%20F_%7B%5Coperatorname%7BExp%7D(0.5)%7D%5Cleft(y_T%5Cright)%20;%20G%20%5CSigma%5Cright)%0A"> Let us combine everything so far and plot an example:</p>
<div id="cell-12" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb3-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">LinearAlgebra</span></span>
<span id="cb3-2"></span>
<span id="cb3-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">struct</span> ARMA_1_1</span>
<span id="cb3-4">    </span>
<span id="cb3-5">    phi</span>
<span id="cb3-6">    theta</span>
<span id="cb3-7">    sigma</span>
<span id="cb3-8">    </span>
<span id="cb3-9"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-10"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Broadcast</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">broadcastable</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">ARMA_1_1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (m,)</span>
<span id="cb3-11"></span>
<span id="cb3-12"></span>
<span id="cb3-13"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_autocovariance_matrix</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">ARMA_1_1</span>,T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100</span>)    </span>
<span id="cb3-14">    autocovariance_matrix <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_autocovariance</span>.(m, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_time_matrix</span>(T))</span>
<span id="cb3-15">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> autocovariance_matrix</span>
<span id="cb3-16"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-17"></span>
<span id="cb3-18"></span>
<span id="cb3-19"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_time_matrix</span>(T)</span>
<span id="cb3-20">    times <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-21">    </span>
<span id="cb3-22">    time_matrix <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(T,T)</span>
<span id="cb3-23">    </span>
<span id="cb3-24">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T</span>
<span id="cb3-25">        time_matrix[t,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">reverse</span>(times[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>t])</span>
<span id="cb3-26">        time_matrix[t,t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> times[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] </span>
<span id="cb3-27">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-28">    </span>
<span id="cb3-29">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> time_matrix</span>
<span id="cb3-30"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-31"></span>
<span id="cb3-32"></span>
<span id="cb3-33"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">get_autocovariance</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">ARMA_1_1</span>,h)        </span>
<span id="cb3-34">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> h <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb3-35">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> m.sigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (m.phi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> m.theta)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> m.phi<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb3-36">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">else</span></span>
<span id="cb3-37">        <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> m.sigma<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> ((m.phi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> m.theta)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>m.phi<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>(h<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> (m.phi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> m.theta)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>m.phi<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span>h <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> m.phi<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))</span>
<span id="cb3-38">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-39"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-40"></span>
<span id="cb3-41"></span>
<span id="cb3-42"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">normalize_covariance</span>(Sigma)</span>
<span id="cb3-43">    G <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">./</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">diag</span>(Sigma))</span>
<span id="cb3-44">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> G<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>Sigma</span>
<span id="cb3-45"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-46"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#-------------------------</span></span>
<span id="cb3-47"></span>
<span id="cb3-48"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">123</span>)</span>
<span id="cb3-49">T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span></span>
<span id="cb3-50"></span>
<span id="cb3-51">arma_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ARMA_1_1</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.75</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb3-52">Sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_autocovariance_matrix</span>(arma_model,T)</span>
<span id="cb3-53">Sigma_tilde <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">normalize_covariance</span>(Sigma)</span>
<span id="cb3-54"></span>
<span id="cb3-55">unconditional <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MvNormal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(T),Sigma_tilde)</span>
<span id="cb3-56"></span>
<span id="cb3-57">arma_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(unconditional)</span>
<span id="cb3-58"></span>
<span id="cb3-59"></span>
<span id="cb3-60">exp_target <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Exponential</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>)</span>
<span id="cb3-61">exp_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(exp_target, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),arma_sample))</span>
<span id="cb3-62"></span>
<span id="cb3-63"></span>
<span id="cb3-64"></span>
<span id="cb3-65">arma_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(arma_sample,legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"ARMA(1,1) sample (standardized covariance matrix)"</span>,fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb3-66"></span>
<span id="cb3-67">exp_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(exp_sample,legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,title <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Transformed ARMA(1,1) sample"</span>)</span>
<span id="cb3-68"></span>
<span id="cb3-69"></span>
<span id="cb3-70"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(</span>
<span id="cb3-71">    arma_plot,</span>
<span id="cb3-72">    exp_plot,</span>
<span id="cb3-73">    layout <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),</span>
<span id="cb3-74">    size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1200</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">600</span>),</span>
<span id="cb3-75">    fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png</span>
<span id="cb3-76">)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="6">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas_files/figure-html/cell-4-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Clearly, the samples from the Copula model are not Gaussian anymore. In fact, we observe a single draw from an ARMA(1,1) Copula with Exponential-distributed marginals.</p>
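<p>The marginal claim is easy to verify via the probability integral transform. The following Python sketch (illustrative, not the post’s Julia code; it uses i.i.d. standard normals instead of the correlated ARMA draw to keep things short) pushes normal draws through the standard normal c.d.f. and the Exponential quantile function:</p>

```python
import numpy as np
from scipy.stats import norm, expon, kstest

rng = np.random.default_rng(123)
T = 500

# Stand-in for a draw from the standardized unconditional distribution;
# the marginal argument is identical for correlated draws.
z = rng.standard_normal(T)

# Probability integral transform: Phi(z) is Uniform(0, 1), so applying the
# Exponential quantile function yields Exponential(scale=0.5) marginals.
y = expon(scale=0.5).ppf(norm.cdf(z))

# Kolmogorov-Smirnov test against the target marginal distribution
stat, pvalue = kstest(y, expon(scale=0.5).cdf)
```

<p>With i.i.d. inputs the KS test is well-calibrated; with a correlated ARMA draw the marginals are still Exponential, but KS p-values lose their usual interpretation because the observations are dependent.</p>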
</section>
<section id="parameter-estimation-with-maximum-likelihood" class="level2">
<h2 class="anchored" data-anchor-id="parameter-estimation-with-maximum-likelihood">Parameter estimation with Maximum Likelihood</h2>
<p>So far, we have only been able to simulate a time-series from the ARMA(1,1) Copula model. In order to fit the model, we will apply Maximum Likelihood. When using Copulas for cross-sectional data, it is usually possible to separate fitting the marginal distributions from fitting the Copula. Unfortunately, this does not work here.</p>
<p>As we only observe one realization of the process per marginal, fitting a distribution based on the marginals alone is impossible. Instead, we now need to optimize both the marginals and the copula at once. This adds the difficulty of having to deal with the marginals’ parameters inside the marginals’ c.d.f.</p>
<p>Namely, our Maximum Likelihood objective looks as follows: <img src="https://latex.codecogs.com/png.latex?%0A%5Cmax%20_%7B%5Cphi,%20%5Ctheta,%20%5Csigma,%20%5Clambda%7D%20%5Clog%20c%5Cleft(F_%5Clambda%5Cleft(y_1%5Cright),%20%5Cldots,%20F_%5Clambda%5Cleft(y_T%5Cright)%20;%20R(%5Cphi,%20%5Ctheta,%20%5Csigma)%5Cright)+%5Csum_%7Bt=1%7D%5ET%20%5Clog%20p_%5Clambda%5Cleft(y_t%5Cright)%0A"> where</p>
<ul>
<li><img src="https://latex.codecogs.com/png.latex?R(%5Cphi,%20%5Ctheta,%20%5Csigma):="> the standardized Gaussian Copula covariance matrix (as a function of the ARMA parameters)</li>
<li><img src="https://latex.codecogs.com/png.latex?F_%5Clambda(%5Ccdot):="> the c.d.f. of an Exponential distribution with parameter <img src="https://latex.codecogs.com/png.latex?%5Clambda"></li>
<li><img src="https://latex.codecogs.com/png.latex?p_%5Clambda(%5Ccdot):="> the probability density of an Exponential distribution with parameter <img src="https://latex.codecogs.com/png.latex?%5Clambda"></li>
</ul>
<p>Optimizing this can become quite ugly, as derivatives with respect to a c.d.f.’s parameters are usually fairly complex. Luckily, the Exponential distribution is quite simple and the respective derivatives are easily found. Even better, the <a href="https://julianlsolvers.github.io/Optim.jl/stable/?ref=sarem-seitz.com">Optim.jl</a> package can optimize our log-likelihood via <a href="https://en.wikipedia.org/wiki/Finite_difference?ref=sarem-seitz.com">finite differences</a> without requiring any derivatives at all.</p>
<p>If we chose a distribution other than the Exponential, finite differences might not suffice. In that case, we would have to either implement the c.d.f. derivatives by hand or hope that <a href="https://juliadiff.org/ChainRulesCore.jl/stable/?ref=sarem-seitz.com">ChainRules.jl</a> can handle them for us.</p>
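<p>To see why finite differences are unproblematic in the Exponential case, here is a minimal, hypothetical sketch (not from the original post) comparing a central finite difference against the hand-derived c.d.f. derivative:</p>
<pre class="sourceCode julia"><code># Illustration only: central finite difference vs. the closed-form derivative
# of the Exponential c.d.f. with respect to its scale parameter lambda
using Distributions

F(lambda, x) = cdf(Exponential(lambda), x)                 # 1 - exp(-x/lambda)
dF_closed(lambda, x) = -(x / lambda^2) * exp(-x / lambda)  # derived by hand

dF_finite(lambda, x; h=1e-6) = (F(lambda + h, x) - F(lambda - h, x)) / (2h)

lambda, x = 2.0, 1.5
# both values agree to high precision, so a derivative-free or
# finite-difference optimizer can handle the likelihood without issue
dF_closed(lambda, x), dF_finite(lambda, x)</code></pre>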
<p>Also, we transform our model parameters to their correct domains via exp and tanh instead of applying box constraints in the Optim optimizer. This worked reasonably accurately and quickly here:</p>
<div id="cell-15" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb4-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Optim</span></span>
<span id="cb4-2"></span>
<span id="cb4-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gauss_copula_ll</span>(R,y)</span>
<span id="cb4-4">    n <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">size</span>(R,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)</span>
<span id="cb4-5">    yt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">transpose</span>(y)</span>
<span id="cb4-6">    R_stab <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> R <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(n)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>)</span>
<span id="cb4-7">    </span>
<span id="cb4-8">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">det</span>(R_stab))  <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">yt*</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inv</span>(R_stab)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(n)))<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*transpose</span>(yt))[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb4-9"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb4-10"></span>
<span id="cb4-11"></span>
<span id="cb4-12"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_loss</span>(params)</span>
<span id="cb4-13">    </span>
<span id="cb4-14">    y_uniform <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Exponential</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])),exp_sample)</span>
<span id="cb4-15">    </span>
<span id="cb4-16">    model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ARMA_1_1</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tanh</span>(params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tanh</span>(params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>]),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>]))</span>
<span id="cb4-17">    </span>
<span id="cb4-18">    autocov <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_autocovariance_matrix</span>(model,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(exp_sample))</span>
<span id="cb4-19">    normalized_autocov <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Hermitian</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">normalize_covariance</span>(autocov)))</span>
<span id="cb4-20">    y_normal <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(), y_uniform)</span>
<span id="cb4-21">        </span>
<span id="cb4-22">    loss <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-gauss_copula_ll</span>(normalized_autocov,y_normal) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logpdf</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Exponential</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(params[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])),exp_sample))</span>
<span id="cb4-23">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> loss</span>
<span id="cb4-24"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb4-25"></span>
<span id="cb4-26">res <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">optimize</span>(likelihood_loss,[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>.,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>.,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>.,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">LBFGS</span>())</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="7">
<pre><code> * Status: success (objective increased between iterations)

 * Candidate solution
    Final objective value:     1.180344e+02

 * Found with
    Algorithm:     L-BFGS

 * Convergence measures
    |x - x'|               = 2.73e-10 ≰ 0.0e+00
    |x - x'|/|x'|          = 2.73e-10 ≰ 0.0e+00
    |f(x) - f(x')|         = 1.14e-13 ≰ 0.0e+00
    |f(x) - f(x')|/|f(x')| = 9.63e-16 ≰ 0.0e+00
    |g(x)|                 = 8.21e-09 ≤ 1.0e-08

 * Work counters
    Seconds run:   58  (vs limit Inf)
    Iterations:    13
    f(x) calls:    40
    ∇f(x) calls:   40</code></pre>
</div>
</div>
<p>Now, let us evaluate the result. For the Exponential distribution, the estimated parameter should be close to the true parameter. Regarding the latent ARMA parameters, we primarily need the estimated auto-covariance to be close to the ground truth. This is indeed the case here:</p>
<div id="cell-17" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb6-1">lambda <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(res.minimizer[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])</span>
<span id="cb6-2">phi <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tanh</span>(res.minimizer[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])</span>
<span id="cb6-3">theta <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">tanh</span>(res.minimizer[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>])</span>
<span id="cb6-4">sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(res.minimizer[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>])</span>
<span id="cb6-5"></span>
<span id="cb6-6">estimated_marginal <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Exponential</span>(lambda)</span>
<span id="cb6-7">estimated_arma_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ARMA_1_1</span>(phi,theta,sigma) </span>
<span id="cb6-8"></span>
<span id="cb6-9">true_acf <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">normalize_covariance</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_autocovariance_matrix</span>(arma_model,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>))[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb6-10">model_acf <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">normalize_covariance</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_autocovariance_matrix</span>(estimated_arma_model,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>))[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb6-11"></span>
<span id="cb6-12"></span>
<span id="cb6-13">lambda_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">groupedbar</span>([[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>] [lambda]],labels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"True Exponential Parameter"</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model Exponential Parameter"</span>]</span>
<span id="cb6-14">            ,xlab<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Lag"</span>,title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"True VS. estimated parameter of Exponential distribution"</span>,</span>
<span id="cb6-15">            fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), margin<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>mm)</span>
<span id="cb6-16"></span>
<span id="cb6-17">acf_plot <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">groupedbar</span>([true_acf model_acf],labels<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>[<span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"True ACF"</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Model ACF"</span>],xlab<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Lag"</span>,title<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"True VS. estimated ACF"</span>,</span>
<span id="cb6-18">            fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), margin<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>mm)</span>
<span id="cb6-19"></span>
<span id="cb6-20"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(lambda_plot,acf_plot,layout<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="8">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
</section>
<section id="forecasting-with-the-copula-model" class="level2">
<h2 class="anchored" data-anchor-id="forecasting-with-the-copula-model">Forecasting with the Copula model</h2>
<p>Finally, we want to use our model to produce actual forecasts. Due to the Copula construction, we can derive the conditional forecast density in closed form. As we will see, however, mean and quantile forecasts still need to be calculated numerically.</p>
<p>First, recall how the Copula model defines a joint density over all ‘training’-observations: <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(y_1,%20%5Cldots,%20y_T%5Cright)=c%5Cleft(F%5Cleft(y_1%5Cright),%20%5Cldots,%20F%5Cleft(y_T%5Cright)%5Cright)%20%5Ccdot%20p%5Cleft(y_1%5Cright)%20%5Ccdots%20p%5Cleft(y_T%5Cright)%0A"> In order to forecast a conditional density at h steps ahead, we simply need to follow standard probability laws: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap%5Cleft(y_%7BT+h%7D%20%5Cmid%20y_1,%20%5Cldots,%20y_T%5Cright)%20%5C%5C%0A=%5Cfrac%7Bp%5Cleft(y_%7BT+h%7D,%20y_1,%20%5Cldots,%20y_T%5Cright)%7D%7Bp%5Cleft(y_1,%20%5Cldots,%20y_T%5Cright)%7D%20%5C%5C%0A=%5Cfrac%7Bc%5Cleft(F%5Cleft(y_%7BT+h%7D%5Cright),%20F%5Cleft(y_1%5Cright),%20%5Cldots,%20F%5Cleft(y_T%5Cright)%5Cright)%20%5Ccdot%20p%5Cleft(y_%7BT+h%7D%5Cright)%20%5Ccdot%20p%5Cleft(y_1%5Cright)%20%5Ccdots%20p%5Cleft(y_T%5Cright)%7D%7Bc%5Cleft(F%5Cleft(y_1%5Cright),%20%5Cldots,%20F%5Cleft(y_T%5Cright)%5Cright)%20%5Ccdot%20p%5Cleft(y_1%5Cright)%20%5Ccdots%20p%5Cleft(y_T%5Cright)%7D%20%5C%5C%0A=%5Cfrac%7Bc%5Cleft(F%5Cleft(y_%7BT+h%7D%5Cright),%20F%5Cleft(y_1%5Cright),%20%5Cldots,%20F%5Cleft(y_T%5Cright)%5Cright)%7D%7Bc%5Cleft(F%5Cleft(y_1%5Cright),%20%5Cldots,%20F%5Cleft(y_T%5Cright)%5Cright)%7D%20%5Ccdot%20p%5Cleft(y_%7BT+h%7D%5Cright)%0A%5Cend%7Bgathered%7D%0A"> This boils down to the ratio of two Copula evaluations times the marginal density evaluated at the target point. However, we still need to find a way to use this equation to calculate a mean forecast and a forecast interval.</p>
<p>As the density is arguably fairly complex, we won’t even try to derive any of these values in closed form. Rather, we use numerical methods to find the target quantities.</p>
<p>For the mean, we simply use quadrature to approximate the usual integral <img src="https://latex.codecogs.com/png.latex?%0A%5Cmathbb%7BE%7D%5Cleft%5By_%7BT+h%7D%20%5Cmid%20y_1,%20%5Cldots,%20y_T%5Cright%5D%20%5Capprox%20%5Cint_0%5EU%20y_%7BT+h%7D%20%5Ccdot%20p%5Cleft(y_%7BT+h%7D%20%5Cmid%20y_1,%20%5Cldots,%20y_T%5Cright)%20d%20y_%7BT+h%7D%0A"> with U a sufficiently large value to capture most of the probability mass (approximation up to infinity is obviously not possible).</p>
<p>For the forecast interval, we use the 90% prediction interval. Thus, we need to find the 5% and the 95% quantiles of the conditional density. This can be done via another approximation, this time through an Ordinary Differential Equation: <img src="https://latex.codecogs.com/png.latex?%0A%5Cfrac%7Bd%20F%5E%7B-1%7D%7D%7Bd%20u%7D=%5Cfrac%7B1%7D%7Bp%5Cleft(F%5E%7B-1%7D(u)%20%5Cmid%20y_1,%20%5Cldots,%20y_T%5Cright)%7D%0A"> with <img src="https://latex.codecogs.com/png.latex?F%5E%7B-1%7D(u)"> the quantile function corresponding to <img src="https://latex.codecogs.com/png.latex?p%5Cleft(y_%7BT+h%7D%20%5Cmid%20y_1,%20%5Cldots,%20y_T%5Cright)"> evaluated at <img src="https://latex.codecogs.com/png.latex?u%20%5Cin%5B0.0,1.0%5D"></p>
<p>For a derivation of this formula, see, for example, <a href="https://blogs.sas.com/content/iml/2022/04/06/differential-equation-quantiles.html?ref=sarem-seitz.com#:~:text=Appendix%3A%20Derive%20the%20quantile%20ODE">here</a>. Integrating the ODE from zero up to the target quantile yields the respective target quantile value. The latter can be done numerically via <a href="https://diffeq.sciml.ai/stable/tutorials/ode_example/?ref=sarem-seitz.com">DifferentialEquations.jl</a>.</p>
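<p>As a quick, hypothetical sanity check (not part of the original code), the same ODE approach recovers the known quantiles of a plain Exponential distribution:</p>
<pre class="sourceCode julia"><code># Illustration only: integrate dFinv/du = 1 / p(Finv(u)) from u close to 0
# up to the target level and compare against the analytical quantile
using Distributions, DifferentialEquations

dist = Exponential(1.0)
quantile_ode(u, p, t) = 1 / pdf(dist, u)   # u plays the role of Finv(t)

u0 = quantile(dist, 1e-6)                  # start just above the 0-quantile
prob = ODEProblem(quantile_ode, u0, (1e-6, 0.95))
sol = solve(prob, Tsit5())

sol.u[end]   # close to quantile(dist, 0.95) = -log(0.05)</code></pre>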
<p>With this, we can finally calculate the forecast and plot the result:</p>
<div id="cell-19" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb7-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">QuadGK</span></span>
<span id="cb7-2"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">DifferentialEquations</span></span>
<span id="cb7-3"></span>
<span id="cb7-4"></span>
<span id="cb7-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#precompute autocovariance matrix to save some computation time</span></span>
<span id="cb7-6">T <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span></span>
<span id="cb7-7">autocovariance <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">construct_autocovariance_matrix</span>(estimated_arma_model, T)</span>
<span id="cb7-8">normalized_autocov <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">normalize_covariance</span>(autocovariance) </span>
<span id="cb7-9"></span>
<span id="cb7-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#this yields the conditional density for any ARMA, any Exponential marginal and at any 'h' in the future</span></span>
<span id="cb7-11"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">evaluate_conditional_density_forecast</span>(x, model<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">ARMA_1_1</span>, marginal<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Distributions.Exponential</span>, y, t_forecast<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-12">    T_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(y)</span>
<span id="cb7-13"></span>
<span id="cb7-14">    target_cov <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> normalized_autocov[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T_train),T_train<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>t_forecast),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T_train),T_train<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span>t_forecast)]</span>
<span id="cb7-15">    </span>
<span id="cb7-16">    y_normal <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cdf</span>.(marginal,y))</span>
<span id="cb7-17">    x_normal <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>(),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">cdf</span>(marginal,x))</span>
<span id="cb7-18">    </span>
<span id="cb7-19">    copula_density_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gauss_copula_ll</span>(target_cov[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T_train,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>T_train],y_normal))</span>
<span id="cb7-20">    copula_density_full <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gauss_copula_ll</span>(target_cov,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(y_normal,x_normal)))</span>
<span id="cb7-21">    marginal_density <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">pdf</span>(marginal,x)</span>
<span id="cb7-22">    </span>
<span id="cb7-23">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> marginal_density <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> copula_density_full<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>copula_density_train     </span>
<span id="cb7-24"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb7-25"></span>
<span id="cb7-26"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#conditional density at forecast period 't'</span></span>
<span id="cb7-27"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">p</span>(x,t) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">evaluate_conditional_density_forecast</span>(x,estimated_arma_model,estimated_marginal,exp_sample,t)</span>
<span id="cb7-28"></span>
<span id="cb7-29"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#mean forecast uses Quadrature to approximate the intractable 'mean'-integral</span></span>
<span id="cb7-30">mean_forecast <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quadgk</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">x-&gt;p</span>(x,t)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span>x, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>, <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(estimated_marginal, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>), rtol<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-4</span>)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>] for t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>]</span>
<span id="cb7-31"></span>
<span id="cb7-32"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#quantile forecast via differential equation: </span></span>
<span id="cb7-33"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#homepages.ucl.ac.uk/~ucahwts/lgsnotes/EJAM_Quantiles.pdf</span></span>
<span id="cb7-34"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">approximate_quantile</span>(q, t<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb7-35"></span>
<span id="cb7-36">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">target_density</span>(x) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">p</span>(x,t)</span>
<span id="cb7-37">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">diffeq</span>(u,p,t) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">target_density</span>(u)</span>
<span id="cb7-38">    </span>
<span id="cb7-39">    u0<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span></span>
<span id="cb7-40">    tspan<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.0</span>,q)</span>
<span id="cb7-41"></span>
<span id="cb7-42">    prob <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ODEProblem</span>(diffeq,u0,tspan)</span>
<span id="cb7-43">    sol <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">solve</span>(prob,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Tsit5</span>(),reltol<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-4</span>,abstol<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-4</span>)</span>
<span id="cb7-44">    </span>
<span id="cb7-45">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> sol.u[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>]</span>
<span id="cb7-46"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb7-47"></span>
<span id="cb7-48"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#10% prediction/forecast interval</span></span>
<span id="cb7-49">lower_05 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">approximate_quantile</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>,t) for t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>]</span>
<span id="cb7-50">upper_95 <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">approximate_quantile</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>,t) for t <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">20</span>]</span>
<span id="cb7-51"></span>
<span id="cb7-52"></span>
<span id="cb7-53"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#plot the final result</span></span>
<span id="cb7-54">ribbon_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(exp_sample[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],mean_forecast) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(exp_sample[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],lower_05)</span>
<span id="cb7-55">ribbon_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>  <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(exp_sample[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],upper_95) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(exp_sample[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],mean_forecast) </span>
<span id="cb7-56"></span>
<span id="cb7-57"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>)[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">49</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],exp_sample[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">49</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>),label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Last 50 observations from TS"</span>)</span>
<span id="cb7-58"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">520</span>),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(exp_sample[<span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>],mean_forecast),ribbon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(ribbon_lower,ribbon_upper),fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Forecast plus interval"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="11">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas_files/figure-html/cell-7-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>This indeed looks quite reasonable, and the forecast appears to converge to a stable distribution as we predict further into the future.</p>
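<p>For intuition on the quantile ODE used above: the quantile function <code>Q</code> of a distribution with density <code>f</code> satisfies <code>dQ/dq = 1/f(Q(q))</code>. As a rough sanity check, a plain Euler integration (a sketch with no solver dependency; the function name is made up for illustration) recovers the known closed-form quantile of the Exponential distribution:</p>

```julia
# Quantile ODE: dQ/dq = 1 / f(Q(q)), starting near the lower end of the support.
# For Exponential(1), f(x) = exp(-x) and the exact quantile is -log(1 - q).
function euler_quantile(q; steps=100_000, u0=1e-8)
    f(x) = exp(-x)        # Exponential(1) density
    h = q / steps
    u = u0
    for _ in 1:steps
        u += h / f(u)     # Euler step of dQ/dq = 1/f(Q)
    end
    return u
end

euler_quantile(0.95)      # close to the exact value -log(0.05) ≈ 2.9957
```

<p>The adaptive <code>Tsit5()</code> solver in the post does the same thing with far fewer function evaluations and controlled error.</p>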
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>As we have seen, Copulas make it possible to extend well-known models to non-Gaussian data. This allowed us to transfer the simplicity of the ARMA model to Exponential marginals, which are defined only for positive values.</p>
<p>One complication arises when the observed time-series becomes very long. Since the unconditional covariance matrix grows quadratically with the length of the series, the model-fitting step quickly becomes computationally infeasible.</p>
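<p>To get a feeling for this scaling, a quick back-of-the-envelope sketch (assuming a dense matrix of <code>Float64</code> entries; the helper name is made up for illustration):</p>

```julia
# Memory required by a dense T×T covariance matrix with Float64 entries
cov_bytes(T) = T^2 * sizeof(Float64)

cov_bytes(500)      # the series length used above: 2_000_000 bytes (~2 MB)
cov_bytes(100_000)  # a long series: 80_000_000_000 bytes (~80 GB)
```

<p>At ~80 GB for a single dense matrix, even storing the covariance becomes prohibitive, before any Cholesky factorization is attempted.</p>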
<p>Then, we need to find a computationally more efficient solution. One possible approach is to use <a href="https://arxiv.org/pdf/2109.04718.pdf?ref=sarem-seitz.com">Implicit Copulas</a>, which define a Copula density through a chain of conditional densities.</p>
<p>Of course, there are many other ways to integrate Copulas into classical statistical and Machine Learning models. For the latter, research is still a little sparse. However, I strongly believe that there is at least some potential for a modern application of these classic statistical objects.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Hamilton, James Douglas. Time series analysis. Princeton University Press, 2020.</p>
<p><strong>[2]</strong> Nelsen, Roger B. An introduction to copulas. Springer Science &amp; Business Media, 2007.</p>
<p><strong>[3]</strong> Smith, Michael Stanley. Implicit copulas: An overview. Econometrics and Statistics, 2021.</p>


</section>

 ]]></description>
  <category>Time Series</category>
  <guid>https://www.sarem-seitz.com/posts/arma-forecasting-for-non-gaussian-time-series-data-using-copulas.html</guid>
  <pubDate>Fri, 17 Jun 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>Bayesian Machine Learning and Julia are a match made in heaven</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/bayesian-machine-learning-and-julia-are-a-match-made-in-heaven.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>As I argued in an <a href="https://sarem-seitz.com/blog/when-is-bayesian-machine-learning-actually-useful/?ref=sarem-seitz.com">earlier article</a>, Bayesian Machine Learning can be quite powerful. Building actual Bayesian models in Python, however, is sometimes a bit of a hassle. Most solutions that you will find online are either <a href="https://keras.io/examples/keras_recipes/bayesian_neural_networks/?ref=sarem-seitz.com">relatively complex</a> or require learning yet <a href="https://towardsdatascience.com/blitz-a-bayesian-neural-network-library-for-pytorch-82f9998916c7?ref=sarem-seitz.com">another domain-specific language</a>. The latter could easily constrain your expressiveness when you need a highly customized solution.</p>
<p>Doing Bayesian Machine Learning in Julia, on the other hand, allows you to mitigate both these issues. In fact, you just need a few lines of raw Julia code to build, for example, a <a href="https://arxiv.org/pdf/2007.06823.pdf?ref=sarem-seitz.com">Bayesian Neural Network</a> for regression. Julia’s <a href="https://fluxml.ai/Flux.jl/stable/?ref=sarem-seitz.com">Flux</a> and <a href="https://turing.ml/stable/?ref=sarem-seitz.com">Turing</a> packages will then handle the heavy workload under the hood.</p>
<p>Hence today, I want to show you how to implement and train a Bayesian Neural Network in less than 30 lines of Julia. Before showing you the code, let us briefly recall the main theoretical aspects:</p>
</section>
<section id="bayesian-machine-learning-in-three-steps" class="level2">
<h2 class="anchored" data-anchor-id="bayesian-machine-learning-in-three-steps">Bayesian Machine Learning in three steps</h2>
<p>As always, we want to find a posterior distribution via Bayes’ law: <img src="https://latex.codecogs.com/png.latex?%0Ap(%5Ctheta%20%5Cmid%20D)=%5Cfrac%7Bp(D%20%5Cmid%20%5Ctheta)%20p(%5Ctheta)%7D%7Bp(D)%7D%0A"> As the data term in the denominator is a constant, we can simplify the above: <img src="https://latex.codecogs.com/png.latex?%0Ap(%5Ctheta%20%5Cmid%20D)%20%5Cpropto%20p(D%20%5Cmid%20%5Ctheta)%20p(%5Ctheta)%0A"> To avoid confusion, let us use the following standard wording: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgathered%7D%0Ap(%5Ctheta):=%5Ctext%20%7B%20'prior%20distribution'%20%7D%20%5C%5C%0Ap(D%20%5Cmid%20%5Ctheta):=%5Ctext%20%7B%20'likelihood%20function'%20%7D%0A%5Cend%7Bgathered%7D%0A"> For Bayesian Neural Network regression, we further specify the likelihood function: <img src="https://latex.codecogs.com/png.latex?%0Ap(D%20%5Cmid%20%5Ctheta)=%5Cprod_%7Bi=1%7D%5EN%20%5Cmathcal%7BN%7D%5Cleft(y_i%20%5Cmid%20f_W%5Cleft(X_i%5Cright),%20%5Csigma%5E2%5Cright)%0A"> This denotes a product of independent normal distributions with means defined by the outputs of a Neural Network. The variance of the Normal distribution is chosen to be a constant.</p>
<p>The corresponding prior distribution could look as follows: <img src="https://latex.codecogs.com/png.latex?%0Ap(%5Ctheta)=p(W,%20%5Csigma)=%5Cprod_%7Bk=1%7D%5EK%20%5Cmathcal%7BN%7D%5Cleft(W_k%20%5Cmid%200,1%5Cright)%20%5Ccdot%20%5CGamma(1,1)%0A"> The priors for <img src="https://latex.codecogs.com/png.latex?K"> network weights are independent standard normal distributions. For the square root of the variance (a.k.a. standard deviation), we use a standard Gamma distribution. So, from a theory perspective, we are all set up and ready to go.</p>
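<p>Likelihood and prior combine into an unnormalized log-posterior, which is what the sampler will later evaluate. As a schematic sketch (all names hypothetical; a toy linear function stands in for the actual network):</p>

```julia
using Distributions

# Schematic: log p(θ|D) = log p(D|θ) + log p(θ) + const.
# f_W is a stand-in for the network; here just a toy linear map.
f_W(w, x) = w[1] .* x .+ w[2]

function log_unnormalized_posterior(w, sigma, X, y)
    log_prior = sum(logpdf.(Normal(0, 1), w)) + logpdf(Gamma(1, 1), sigma)
    log_lik = sum(logpdf.(Normal.(f_W(w, X), sigma), y))  # Gaussian likelihood
    return log_lik + log_prior
end

log_unnormalized_posterior([0.5, 0.0], 1.0, [1.0, 2.0], [0.4, 1.1])
```

<p>Turing will build exactly this kind of quantity for us under the hood, so we never have to write it out by hand.</p>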
<p>Ideally, we now want to implement the Bayesian Neural Network in the following steps:</p>
<ol type="1">
<li><strong>Define the likelihood function</strong></li>
<li><strong>Define the prior distribution</strong></li>
<li><strong>Train the model</strong></li>
</ol>
<p>Having these three steps separate from each other in the code will help us to</p>
<ul>
<li><strong>Maintain readability</strong> - Besides the corresponding functions being smaller, a potential reader can also discern the likelihood from the prior more easily.</li>
<li><strong>Keep the code testable at a granular level</strong> - Likelihood and prior distribution are clearly separate concerns. Thus, we should also be able to test them individually.</li>
</ul>
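<p>To make the testability point concrete, a granular test of the prior alone might look like this (a hypothetical sketch; <code>n_weights</code> is a stand-in value for the actual weight count derived later):</p>

```julia
using Test, Distributions

n_weights = 16                                 # stand-in for the real weight count
weight_prior = MvNormal(zeros(n_weights), ones(n_weights))
sigma_prior = Gamma(1.0, 1.0)

@test length(rand(weight_prior)) == n_weights  # prior draw has the right dimension
@test rand(sigma_prior) > 0                    # standard deviations must be positive
```

<p>Because the prior is its own object, such a test never needs to touch the network or the inference machinery.</p>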
<p>With this in mind, let us start building the model in Julia.</p>
</section>
<section id="defining-the-likelihood-function" class="level2">
<h2 class="anchored" data-anchor-id="defining-the-likelihood-function">Defining the likelihood function</h2>
<p>The <code>Flux</code> library provides everything we need to build and work with Neural Networks. It has <code>Dense</code> to build feedforward layers and <code>Chain</code> to combine the layers into a network.</p>
<p>Our <code>Likelihood</code> struct therefore consists of the Neural Network, <code>network</code>, and the standard deviation, <code>sigma</code>. In the feedforward pass, we use the network’s output and <code>sigma</code> to define conditional mean and standard deviation of the Gaussian likelihood:</p>
<div id="d78dce36-afb0-4ed8-bbdc-d098c16b1af3" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Flux</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Distributions</span></span>
<span id="cb1-2"></span>
<span id="cb1-3"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">struct</span> Likelihood</span>
<span id="cb1-4">    network</span>
<span id="cb1-5">    sigma</span>
<span id="cb1-6"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb1-7">Flux.<span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@functor</span> Likelihood <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#tell Flux to look for trainable parameters in Likelihood</span></span>
<span id="cb1-8"></span>
<span id="cb1-9">(p<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Likelihood</span>)(x) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Normal</span>.(p.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">network</span>(x)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>], p.sigma[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]); <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Flux only recognizes Matrix parameters but Normal() needs a scalar for sigma</span></span></code></pre></div>
</div>
<p>The dot in <code>Normal.(...)</code> lets us define one Normal distribution per network output, each with standard deviation <code>sigma</code>. We could combine this with <code>logpdf(...)</code> from the <a href="https://juliastats.org/Distributions.jl/stable/?ref=sarem-seitz.com">Distributions</a> library in order to train the model with maximum likelihood gradient descent. To perform Bayesian Machine Learning, however, we need to add a few more elements.</p>
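<p>As a standalone illustration of that broadcasting dot, decoupled from the network (a sketch assuming only the Distributions package; the vector of means stands in for the network outputs):</p>

```julia
using Distributions

means = [0.0, 1.0, 2.0]          # stand-ins for the network outputs
dists = Normal.(means, 0.5)      # one Normal per output, shared sigma

length(dists)                    # 3 independent distributions
mean(dists[2])                   # 1.0
sum(logpdf.(dists, means))       # joint log-likelihood, evaluated at the means
```

<p>Broadcasting <code>logpdf</code> over the resulting vector of distributions is exactly how the independent-observations likelihood is evaluated.</p>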
<p>This leads us to the central function of this article, namely <code>Flux.destructure()</code>. From the documentation:</p>
<div id="1273e2a3-67a2-4d86-b8fb-69aa78395d9b" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb2-1"><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@doc</span> Flux.destructure</span></code></pre></div>
<div class="cell-output cell-output-display cell-output-markdown" data-execution_count="2">
<pre><code>destructure(model) -&gt; vector, reconstructor</code></pre>
<p>Copies all <a href="@ref"><code>trainable</code></a>, <a href="@ref"><code>isnumeric</code></a> parameters in the model to a vector, and returns also a function which reverses this transformation. Differentiable.</p>
<section id="example" class="level1">
<h1>Example</h1>
<pre class="jldoctest"><code>julia&gt; v, re = destructure((x=[1.0, 2.0], y=(sin, [3.0 + 4.0im])))
(ComplexF64[1.0 + 0.0im, 2.0 + 0.0im, 3.0 + 4.0im], Restructure(NamedTuple, ..., 3))

julia&gt; re([3, 5, 7+11im])
(x = [3.0, 5.0], y = (sin, ComplexF64[7.0 + 11.0im]))</code></pre>
<p>If <code>model</code> contains various number types, they are promoted to make <code>vector</code>, and are usually restored by <code>Restructure</code>. Such restoration follows the rules of <code>ChainRulesCore.ProjectTo</code>, and thus will restore floating point precision, but will permit more exotic numbers like <code>ForwardDiff.Dual</code>.</p>
<p>If <code>model</code> contains only GPU arrays, then <code>vector</code> will also live on the GPU. At present, a mixture of GPU and ordinary CPU arrays is undefined behaviour.</p>
</section>
</div>
</div>
<p>In summary, <code>destructure(...)</code> takes an instantiated model struct and returns a tuple with two elements:</p>
<ol type="1">
<li>The <strong>model parameters</strong>, concatenated into a single vector</li>
<li>A <strong>reconstructor function</strong> that takes a parameter vector as in 1. as input and returns the model with those parameters</li>
</ol>
<p>The latter is important as we can feed an arbitrary parameter vector to the reconstructor. As long as its length is valid, it returns the corresponding model with the given parameter configuration. In code:</p>
<div id="6f6b156b-7db7-4768-9fa7-b8f2801a551d" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb5-1">likelihood <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Likelihood</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Chain</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Dense</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>,tanh),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Dense</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb5-2"></span>
<span id="cb5-3">params, likelihood_reconstructor <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">destructure</span>(likelihood)</span>
<span id="cb5-4">n_weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(params) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span></span>
<span id="cb5-5"></span>
<span id="cb5-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(weights, sigma) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_reconstructor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(weights<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>,sigma));</span></code></pre></div>
</div>
<p>The last function will allow us to provide weights and standard deviation parameters separately to the reconstructor. This is a necessary step in order for <code>Turing</code> to handle the Bayesian inference part.</p>
<p>From here, we are ready to move to the prior distribution.</p>
</section>
<section id="defining-the-prior-distribution" class="level2">
<h2 class="anchored" data-anchor-id="defining-the-prior-distribution">Defining the prior distribution</h2>
<p>This part is very short - we only need to define the prior distributions for the weight vector and the standard deviation scalar:</p>
<div id="d23027ff-e7e6-4405-a41a-a25ddc9ef4fb" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb6-1">weight_prior <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MvNormal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(n_weights), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(n_weights))</span>
<span id="cb6-2">sigma_prior <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Gamma</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>.,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>.);</span></code></pre></div>
</div>
<p>Having defined both likelihood and prior, we can take samples from the <strong>prior predictive distribution</strong>,</p>
<p><img src="https://latex.codecogs.com/png.latex?%0Ap(y%20%5Cmid%20X)=%5Cprod_%7Bi=1%7D%5EN%20%5Cint%20%5Cmathcal%7BN%7D%5Cleft(y_i%20%5Cmid%20f_W%5Cleft(X_i%5Cright),%20%5Csigma%5E2%5Cright)%20p(W)%20p(%5Csigma)%20d%20W%20d%20%5Csigma%0A"></p>
<p>While this might look complicated as a formula, we are basically just drawing Monte Carlo samples. The prior predictive distribution itself includes the observation noise from <code>sigma</code>. Draws from the network alone, i.e.&nbsp;of the prior predictive mean, yield nice and smooth samples:</p>
<div id="fcd21a61-2954-4001-8574-79d7d4bb1fd4" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb7-1">Xline <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">transpose</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]))</span>
<span id="cb7-2"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(weight_prior), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(sigma_prior))(Xline)</span>
<span id="cb7-3"></span>
<span id="cb7-4"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Plots</span></span>
<span id="cb7-5"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">54321</span>)</span>
<span id="cb7-6"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(weight_prior), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(sigma_prior))(Xline)),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red, legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb7-7"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(weight_prior), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(sigma_prior))(Xline)),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red)</span>
<span id="cb7-8"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(weight_prior), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(sigma_prior))(Xline)),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red)</span>
<span id="cb7-9"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(weight_prior), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(sigma_prior))(Xline)),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red)</span>
<span id="cb7-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(weight_prior), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(sigma_prior))(Xline)),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="5">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/bayesian-machine-learning-and-julia-are-a-match-made-in-heaven_files/figure-html/cell-6-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>Now, we can actually train the model.</p>
</section>
<section id="training-the-bayesian-neural-network" class="level2">
<h2 class="anchored" data-anchor-id="training-the-bayesian-neural-network">Training the Bayesian Neural Network</h2>
<p>For this example, we’ll use synthetic data sampled from <img src="https://latex.codecogs.com/png.latex?%0Ap(y,%20X)=%5Cprod_%7Bi=1%7D%5E%7B50%7D%20%5Cmathcal%7BN%7D%5Cleft(y_i%20%5Cmid%20%5Csin%20%5Cleft(X_i%5Cright),%200.25%5E2%5Cright)%20%5Ccdot%20%5Cmathcal%7BU%7D%5Cleft(X_i%20%5Cmid-2,2%5Cright)%0A"> The latter factor denotes a uniform density over <img src="https://latex.codecogs.com/png.latex?(-2,2)">.</p>
<div id="e3fbc789-c0d4-44fb-ac84-864c24f34edc" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb8-1"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">54321</span>)</span>
<span id="cb8-2">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb8-3">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sin</span>.(X) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">50</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.25</span></span>
<span id="cb8-4"></span>
<span id="cb8-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter</span>(X[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>], y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>green,legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="6">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/bayesian-machine-learning-and-julia-are-a-match-made-in-heaven_files/figure-html/cell-7-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>In order to use <code>Turing</code>, we need to define a model as <a href="https://turing.ml/dev/docs/using-turing/quick-start?ref=sarem-seitz.com">explained in their documentation</a>. Applied to our example, we get the following:</p>
<div id="492db7b3-7e53-40b8-be91-5939dd4410f1" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb9" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb9-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Turing</span></span>
<span id="cb9-2"></span>
<span id="cb9-3"><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@model</span> <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">TuringModel</span>(likelihood_conditional, weight_prior, sigma_prior, X, y)</span>
<span id="cb9-4">    weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> weight_prior</span>
<span id="cb9-5">    sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> sigma_prior</span>
<span id="cb9-6"></span>
<span id="cb9-7">    predictions <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_conditional</span>(weights,sigma)(X)</span>
<span id="cb9-8">    </span>
<span id="cb9-9">    y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>] <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">~</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Product</span>(predictions)</span>
<span id="cb9-10"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>;</span></code></pre></div>
</div>
<p>Finally, we need to choose an algorithm for Bayesian posterior inference. As our model is comparatively small, <a href="https://en.wikipedia.org/wiki/Hamiltonian_Monte_Carlo?ref=sarem-seitz.com">Hamiltonian Monte Carlo</a> (HMC) is a suitable choice. HMC is widely considered a gold-standard algorithm for Bayesian machine learning, although it becomes quite inefficient in high dimensions.</p>
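<p>For larger models, <code>Turing</code> also provides the adaptive No-U-Turn Sampler (NUTS), which tunes the leapfrog step size during warm-up instead of requiring it to be set by hand. The sketch below shows the one-line swap; it is not used in the rest of this post, and the <code>0.65</code> target acceptance rate is just a conventional default:</p>

```julia
using Turing

# Sketch: swap HMC for NUTS. Assumes TuringModel, likelihood_conditional,
# weight_prior, sigma_prior, X and y are defined as in the surrounding post.
ch_nuts = sample(
    TuringModel(likelihood_conditional, weight_prior, sigma_prior, X, y),
    NUTS(0.65),  # target acceptance rate; step size is adapted automatically
    1000,
)
```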
<p>Nevertheless, we now use HMC via Turing and collect the resulting draws from the MCMC posterior:</p>
<div id="57087cab-b528-464a-be6e-b1a2340c24bc" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb10-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span></span>
<span id="cb10-2"></span>
<span id="cb10-3"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">54321</span>)</span>
<span id="cb10-4">N <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5000</span></span>
<span id="cb10-5">ch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">TuringModel</span>(likelihood_conditional, weight_prior, sigma_prior, X , y), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">HMC</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>), N);</span>
<span id="cb10-6"></span>
<span id="cb10-7"></span>
<span id="cb10-8">weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Array</span>(MCMCChains.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group</span>(ch, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>weights).value) <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get posterior MCMC samples for network weights</span></span>
<span id="cb10-9">sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Array</span>(MCMCChains.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group</span>(ch, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>sigma).value); <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get posterior MCMC samples for standard deviation</span></span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>Sampling: 100%|█████████████████████████████████████████| Time: 0:00:05</code></pre>
</div>
</div>
<p>From here, we can visualize the full posterior predictive distribution, <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(y%5E*%20%5Cmid%20X%5E*,%20X,%20y%5Cright)=%5Cint%20%5Cprod_%7Bi=1%7D%5E%7BN%5E*%7D%20%5Cmathcal%7BN%7D%5Cleft(y_i%5E*%20%5Cmid%20f_W%5Cleft(X_i%5E*%5Cright),%20%5Csigma%5E2%5Cright)%20%5Ccdot%20p(W,%20%5Csigma%20%5Cmid%20X,%20y)%20d%20W%20d%20%5Csigma%0A"> This is done in a similar fashion to the prior predictive distribution (starred variables denote new inputs outside the training set). The only difference is that we now use the samples from the MCMC posterior distribution.</p>
<div id="d1f36430-ea67-46e3-a71e-d836ee7a1295" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb12-1"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">54321</span>)</span>
<span id="cb12-2"></span>
<span id="cb12-3">posterior_predictive_mean_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb12-4">posterior_predictive_full_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb12-5"></span>
<span id="cb12-6"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span></span>
<span id="cb12-7">    samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb12-8">    W <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> weights[samp,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb12-9">    sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas[samp,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb12-10">    posterior_predictive_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_reconstructor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(W[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],sigma[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]))</span>
<span id="cb12-11">    </span>
<span id="cb12-12">    predictive_distribution <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">posterior_predictive_model</span>(Xline)</span>
<span id="cb12-13">    postpred_full_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Product</span>(predictive_distribution))</span>
<span id="cb12-14">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">push!</span>(posterior_predictive_mean_samples,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(predictive_distribution))</span>
<span id="cb12-15">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">push!</span>(posterior_predictive_full_samples, postpred_full_sample)</span>
<span id="cb12-16"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb12-17"></span>
<span id="cb12-18">posterior_predictive_mean_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(posterior_predictive_mean_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb12-19"></span>
<span id="cb12-20">pp_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(posterior_predictive_mean_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb12-21">pp_mean_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>),posterior_predictive_mean_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb12-22">pp_mean_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>),posterior_predictive_mean_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb12-23"></span>
<span id="cb12-24">posterior_predictive_full_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(posterior_predictive_full_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb12-25">pp_full_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>),posterior_predictive_full_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb12-26">pp_full_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>),posterior_predictive_full_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb12-27"></span>
<span id="cb12-28"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],pp_mean, ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (pp_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_full_lower, pp_full_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_mean),legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>bottomright, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Full posterior predictive distribution"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb12-29"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>], pp_mean, ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (pp_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_mean_lower, pp_mean_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_mean), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Posterior predictive mean distribution (a.k.a. epistemic uncertainty)"</span>)</span>
<span id="cb12-30"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter!</span>(X[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>green, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="9">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/bayesian-machine-learning-and-julia-are-a-match-made-in-heaven_files/figure-html/cell-10-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
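<p>The ribbons above are pointwise quantile bands computed with <code>mapslices</code>, which applies a function along one array dimension. A minimal, dependency-free sketch of that step (toy values, not the actual posterior samples):</p>

```julia
using Statistics  # for quantile

# Toy posterior-predictive matrix: rows = input points, columns = MCMC draws.
samples = [1.0 2.0 3.0 4.0;
           10.0 20.0 30.0 40.0]

# 5%- and 95%-quantiles per row, i.e. across the posterior draws (dims=2);
# [:] flattens the resulting column matrix into a vector, as in the post.
lower = mapslices(x -> quantile(x, 0.05), samples, dims=2)[:]
upper = mapslices(x -> quantile(x, 0.95), samples, dims=2)[:]
# lower ≈ [1.15, 11.5], upper ≈ [3.85, 38.5]
```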
<p>Using the above example, it is easy to try out other prior distributions.</p>
</section>
<section id="plug-and-play-with-different-prior-distributions" class="level2">
<h2 class="anchored" data-anchor-id="plug-and-play-with-different-prior-distributions">Plug-and-play with different prior distributions</h2>
<p>Another big advantage is that <code>Turing</code> can use almost any distribution from the <code>Distributions</code> library as a prior. This also allows us to try out some exotic weight priors, say a <a href="https://juliastats.org/Distributions.jl/latest/univariate/?ref=sarem-seitz.com#Distributions.Semicircle">Semicircle distribution</a> with radius 0.5. All we have to do is replace the Gaussian prior:</p>
<div id="ef23d3ff-012e-44dd-88bf-b282c16198c3" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb13" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb13-1">weight_prior <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Product</span>([<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Semicircle</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span>) for _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>n_weights]);</span></code></pre></div>
</div>
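<p>Any other univariate distribution works the same way; for instance, a heavier-tailed Laplace prior over the weights would look as follows (a sketch, not used in what comes next; <code>n_weights</code> is set to a hypothetical value here, whereas in the post it comes from the network setup):</p>

```julia
using Distributions

n_weights = 20  # hypothetical count; in the post this matches the network size

# i.i.d. Laplace(0, 0.5) over all weights, bundled into one multivariate prior.
weight_prior = Product([Laplace(0.0, 0.5) for _ in 1:n_weights])
```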
<p>With the same setup as before, we get the following posterior predictive distribution:</p>
<div id="7e60333d-83e5-4913-bbfc-cf8651664df0" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb14" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb14-1"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">54321</span>)</span>
<span id="cb14-2">N <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5000</span></span>
<span id="cb14-3">ch <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sample</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">TuringModel</span>(likelihood_conditional, weight_prior, sigma_prior, X , y), <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">HMC</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.025</span>, <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>), N);</span>
<span id="cb14-4"></span>
<span id="cb14-5"></span>
<span id="cb14-6">weights <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MCMCChains.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group</span>(ch, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>weights).value <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get posterior MCMC samples for network weights</span></span>
<span id="cb14-7">sigmas <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> MCMCChains.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">group</span>(ch, <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>sigma).value <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#get posterior MCMC samples for standard deviation</span></span>
<span id="cb14-8"></span>
<span id="cb14-9"></span>
<span id="cb14-10">posterior_predictive_mean_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb14-11">posterior_predictive_full_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> []</span>
<span id="cb14-12"></span>
<span id="cb14-13"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">10000</span></span>
<span id="cb14-14">    samp <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">5000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)</span>
<span id="cb14-15">    W <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> weights[samp,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb14-16">    sigma <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> sigmas[samp,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]</span>
<span id="cb14-17">    posterior_predictive_model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">likelihood_reconstructor</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">vcat</span>(W[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],sigma[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]))</span>
<span id="cb14-18">    </span>
<span id="cb14-19">    predictive_distribution <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">posterior_predictive_model</span>(Xline)</span>
<span id="cb14-20">    postpred_full_sample <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Product</span>(predictive_distribution))</span>
<span id="cb14-21">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">push!</span>(posterior_predictive_mean_samples,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>.(predictive_distribution))</span>
<span id="cb14-22">    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">push!</span>(posterior_predictive_full_samples, postpred_full_sample)</span>
<span id="cb14-23"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb14-24">    </span>
<span id="cb14-25">posterior_predictive_mean_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(posterior_predictive_mean_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb14-26">pp_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mean</span>(posterior_predictive_mean_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb14-27">pp_mean_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>),posterior_predictive_mean_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb14-28">pp_mean_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>),posterior_predictive_mean_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb14-29"></span>
<span id="cb14-30"></span>
<span id="cb14-31"></span>
<span id="cb14-32">posterior_predictive_full_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(posterior_predictive_full_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb14-33">pp_full_lower <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.05</span>),posterior_predictive_full_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb14-34">pp_full_upper <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">mapslices</span>(x <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-&gt;</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">quantile</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.95</span>),posterior_predictive_full_samples, dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb14-35"></span>
<span id="cb14-36"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],pp_mean, ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (pp_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_full_lower, pp_full_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_mean),legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>bottomright, label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Full posterior predictive distribution"</span>, fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span>
<span id="cb14-37"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>], pp_mean, ribbon <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> (pp_mean<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_mean_lower, pp_mean_upper<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>pp_mean), label<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Posterior predictive mean distribution (a.k.a. epistemic uncertainty)"</span>)</span>
<span id="cb14-38"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter!</span>(X[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>green, label <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="st" style="color: #20794D;
background-color: null;
font-style: inherit;">"Training data"</span>)</span></code></pre></div>
<div class="cell-output cell-output-stderr">
<pre><code>Sampling: 100%|█████████████████████████████████████████| Time: 0:00:05</code></pre>
</div>
<div class="cell-output cell-output-display" data-execution_count="11">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/bayesian-machine-learning-and-julia-are-a-match-made-in-heaven_files/figure-html/cell-12-output-2.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>The possibilities obviously don’t stop here. Another fruitful adjustment could be the introduction of <a href="https://en.wikipedia.org/wiki/Hyperprior?ref=sarem-seitz.com">hyperprior distributions</a>, e.g.&nbsp;over the standard deviation of the weight priors.</p>
<p>Using a model different from a Neural Network is also quite simple. You only need to adjust the Likelihood struct and corresponding functions.</p>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>This article was a brief introduction to Bayesian Machine Learning with Julia. As a matter of fact, Julia is not just fast but can also make coding much easier and more efficient.</p>
<p>While Julia is still a young language with some caveats to deploying Julia programs to production, it is definitely an awesome language for research and prototyping. Especially the seamless interoperability between different libraries can considerably shorten iteration cycles both inside and outside of academia.</p>
<p>At this point, I am also quite hopeful that we will, at some point in the near future, see more Julia being used in industry-grade production code.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Bezanson, Jeff, et al.&nbsp;Julia: A fast dynamic language for technical computing. arXiv preprint arXiv:1209.5145, 2012.</p>
<p><strong>[2]</strong> Ge, Hong; Xu, Kai; Ghahramani, Zoubin. Turing: a language for flexible probabilistic inference. In: International conference on artificial intelligence and statistics. PMLR, 2018. p.&nbsp;1682-1690.</p>


</section>

 ]]></description>
  <category>Bayesian</category>
  <category>Julia</category>
  <guid>https://www.sarem-seitz.com/posts/bayesian-machine-learning-and-julia-are-a-match-made-in-heaven.html</guid>
  <pubDate>Tue, 08 Mar 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>When is Bayesian Machine Learning actually useful?</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/when-is-bayesian-machine-learning-actually-useful.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>When it comes to Bayesian Machine Learning, you likely either love it or prefer to stay at a safe distance from anything Bayesian. Given that current state-of-the-art models hardly ever mention Bayes at all, there is probably at least some rationale behind the latter.</p>
<p>On the other hand, many high-profile research groups have been working on Bayesian Machine Learning tools for decades. And they still do.</p>
<p>Thus, it is rather unlikely that this field is complete quackery either. As so often in life, the truth probably lies somewhere between the two extremes. In this article, I want to share my view on the question in the title and point out when Bayesian Machine Learning can be helpful.</p>
<p>To avoid confusion, let us briefly define what Bayesian Machine Learning means in the context of this article:</p>
<blockquote class="blockquote">
<p><em>“The Bayesian framework for machine learning states that you start out by enumerating all reasonable models of the data and assigning your prior belief P(M) to each of these models. Then, upon observing the data D, you evaluate how probable the data was under each of these models to compute P(D|M).”</em> - <a href="http://mlg.eng.cam.ac.uk/zoubin/bayesian.html?ref=sarem-seitz.com#:~:text=The%20Bayesian%20framework%20for%20machine,P(D%7CM)."><strong>Zoubin Ghahramani</strong></a></p>
</blockquote>
<p>Less formally, we apply tools and frameworks from <a href="https://en.wikipedia.org/wiki/Bayesian_statistics?ref=sarem-seitz.com">Bayesian statistics</a> to Machine Learning models. I will provide some references at the end, in case you are not familiar with Bayesian statistics yet.</p>
<p>Also, our discussion will essentially equate Machine Learning with neural networks and differentiable models in general (particularly Linear Regression). Since Deep Learning is currently the cornerstone of modern Machine Learning, this appears to be a fair approach.</p>
<p>As a final disclaimer, we will differentiate between frequentist and Bayesian Machine Learning. The former includes the standard ML methods and loss functions that you are probably already familiar with.</p>
<p>Also, the fact that ‘Bayesian’ is written in uppercase but ‘frequentist’ is not carries no judgemental meaning.</p>
<p>Now, let us begin with a surprising result:</p>
</section>
<section id="your-frequentist-model-is-probably-already-bayesian" class="level2">
<h2 class="anchored" data-anchor-id="your-frequentist-model-is-probably-already-bayesian">Your frequentist model is probably already Bayesian</h2>
<p>This statement might sound surprising and odd. However, there is a neat connection between the Bayesian and frequentist learning paradigm.</p>
<p>Let’s start with Bayes’ formula <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Balign%7D%0Ap(%5Ctheta%20%5Cmid%20D)=%5Cfrac%7Bp(D%20%5Cmid%20%5Ctheta)%20p(%5Ctheta)%7D%7Bp(D)%7D%5Clabel%7Beq1%7D%5Ctag%7B1%7D%0A%5Cend%7Balign%7D%0A"> and apply it to a Bayesian Neural Network: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Balign%7D%0A&amp;%5Ctheta:%20%5Ctext%7BNetwork%20weights%7D%5C%5C%0A&amp;D%20:%20%5Ctext%7BDataset%7D%5C%5C%0A&amp;p(%5Ctheta)%20:%20%5Ctext%7BPrior%20distribution%20over%20network%20weights%7D%5C%5C%0A&amp;p(D%20%5Cmid%20%5Ctheta)%20:%20%5Ctext%7BLikelihood%20function%7D%5C%5C%0A%5Cend%7Balign%7D%0A"> As an illustrative example, we could have <img src="https://latex.codecogs.com/png.latex?%0Ap(D%20%5Cmid%20%5Ctheta)=%5Cprod_%7Bi=1%7D%5EN%20%5Cmathcal%7BN%7D%5Cleft(y_i%20%5Cmid%20f_%5Ctheta%5Cleft(X_i%5Cright),%20%5Csigma%5E2%5Cright)%5Ctag%7B2%7D%5Clabel%7B2%7D%0A"> i.e.&nbsp;the network output defines the mean of the target variable which, presumably, follows a Normal distribution. For the prior over the network weights - let’s presume we have K weights in total - we might choose <img src="https://latex.codecogs.com/png.latex?%0Ap(%5Ctheta)=%5Cprod_%7Bk=1%7D%5EK%20%5Cmathcal%7BN%7D%5Cleft(%5Ctheta_k%20%5Cmid%200,%20%5Ceta%5E2%5Cright)%20%5Ctag%7B3%7D%5Clabel%7B3%7D%0A"> The setup with (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B2%7D">) and (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B3%7D">) is fairly standard for Bayesian Neural Networks. Computing the posterior weight distribution in (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B1%7D">) exactly, however, turns out to be intractable in any reasonable setting for Bayesian Neural Networks. This is nothing new for Bayesian models either.</p>
<p>We now have a few options to deal with this issue, e.g.&nbsp;<a href="https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo?ref=sarem-seitz.com">MCMC</a>, <a href="https://en.wikipedia.org/wiki/Variational_Bayesian_methods?ref=sarem-seitz.com">Variational Bayes</a> or <a href="https://en.wikipedia.org/wiki/Maximum_a_posteriori_estimation?ref=sarem-seitz.com">Maximum a-posteriori estimation</a> (MAP). Technically, the latter only gives us a point estimate for the posterior maximum, not the full posterior distribution: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A&amp;%20%5Ctheta%5E%7BM%20A%20P%7D=%5Coperatorname%7Bargmax%7D_%5Ctheta%20p(%5Ctheta%20%5Cmid%20D)%20%5C%5C%0A=%20&amp;%20%5Coperatorname%7Bargmax%7D_%5Ctheta%20%5Cfrac%7Bp(D%20%5Cmid%20%5Ctheta)%20p(%5Ctheta)%7D%7Bp(D)%7D%0A%5Cend%7Baligned%7D%0A"> Given that a probability density is strictly positive over its domain, we can introduce logarithms: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A&amp;%20%5Ctheta%5E%7B%5Ctext%20%7BMAP%20%7D%7D=%5Coperatorname%7Bargmax%7D_%5Ctheta%20%5Clog%20p(%5Ctheta%20%5Cmid%20D)%20%5C%5C%0A=%20&amp;%20%5Coperatorname%7Bargmax%7D_%5Ctheta%5B%5Clog%20p(D%20%5Cmid%20%5Ctheta)+%5Clog%20p(%5Ctheta)-%5Clog%20p(D)%5D%20%5C%5C%0A=%20&amp;%20%5Coperatorname%7Bargmax%7D_%5Ctheta%5B%5Clog%20p(D%20%5Cmid%20%5Ctheta)+%5Clog%20p(%5Ctheta)%5D%0A%5Cend%7Baligned%7D%5Clabel%7B4%7D%5Ctag%7B4%7D%0A"> Since the last summand does not depend on the model parameters, it does not affect the argmax. Hence, we can leave it out.</p>
<section id="from-bayesian-map-to-regularized-mse" class="level3">
<h3 class="anchored" data-anchor-id="from-bayesian-map-to-regularized-mse">From Bayesian MAP to regularized MSE</h3>
<p>The purpose of equations (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B2%7D">) and (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B3%7D">) was only an illustrative one. Let us replace likelihood and prior with the following: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bgather%7D%0Ap(D%20%5Cmid%20%5Ctheta)=%5Cprod_%7Bi=1%7D%5EN%20%5Cfrac%7B1%7D%7BZ_1%7D%20e%5E%7B-%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2%7D%5Clabel%7B5%7D%5Ctag%7B5%7D%20%5C%5C%0Ap(%5Ctheta)=%5Cprod_%7Bk=1%7D%5EK%20%5Cfrac%7B1%7D%7BZ_2%7D%20e%5E%7B-%5Cleft(%5Cfrac%7B%5Ctheta_k%7D%7B%5Clambda%7D%5Cright)%5E2%7D,%20%5Clambda%3E0%5Clabel%7B6%7D%5Ctag%7B6%7D%0A%5Cend%7Bgather%7D%0A"> The Z-terms are normalization constants whose sole purpose is to yield valid probability densities.</p>
<p>Now we can plug (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B5%7D">) and (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B6%7D">) into (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B4%7D">): <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A%5Ctheta%5E%7B%5Ctext%20%7BMAP%20%7D%7D&amp;=%5Coperatorname%7Bargmax%7D_%5Ctheta%5Cleft%5B%5Clog%20%5Cprod_%7Bi=1%7D%5EN%20%5Cfrac%7B1%7D%7BZ_1%7D%20e%5E%7B-%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2%7D+%5Clog%20%5Cprod_%7Bk=1%7D%5EK%20%5Cfrac%7B1%7D%7BZ_2%7D%20e%5E%7B-%5Cleft(%5Cfrac%7B%5Ctheta_k%7D%7B%5Clambda%7D%5Cright)%5E2%7D%5Cright%5D%20%5C%5C%0A&amp;=%20%20%5Coperatorname%7Bargmax%7D_%5Ctheta%5Cleft%5B%5Csum_%7Bi=1%7D%5EN%20%5Clog%20%5Cfrac%7B1%7D%7BZ_1%7D%20e%5E%7B-%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2%7D+%5Csum_%7Bk=1%7D%5EK%20%5Clog%20%5Cfrac%7B1%7D%7BZ_2%7D%20e%5E%7B-%5Cleft(%5Cfrac%7B%5Ctheta_k%7D%7B%5Clambda%7D%5Cright)%5E2%7D%5Cright%5D%20%5C%5C%0A&amp;=%20%20%5Coperatorname%7Bargmax%7D_%5Ctheta%5Cleft%5B-N%20%5Clog%20Z_1-%5Csum_%7Bi=1%7D%5EN%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2-K%20%5Clog%20Z_2-%5Csum_%7Bk=1%7D%5EK%5Cleft(%5Cfrac%7B%5Ctheta_k%7D%7B%5Clambda%7D%5Cright)%5E2%5Cright%5D%20%5C%5C%0A&amp;=%20%20%5Coperatorname%7Bargmax%7D_%5Ctheta%5Cleft%5B-%5Csum_%7Bi=1%7D%5EN%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2-%5Cfrac%7B1%7D%7B%5Clambda%5E2%7D%20%5Csum_%7Bk=1%7D%5EK%20%5Ctheta_k%5E2%5Cright%5D%0A%5Cend%7Baligned%7D%0A"> As the maximum of a function is equal to the minimum of the negative function, we can write <img 
src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A%5Ctheta%5E%7B%5Ctext%20%7BMAP%20%7D%7D&amp;=%5Coperatorname%7Bargmin%7D_%5Ctheta%5Cleft%5B%5Csum_%7Bi=1%7D%5EN%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2+%5Cfrac%7B1%7D%7B%5Clambda%5E2%7D%5C%7C%5Ctheta%5C%7C_2%5E2%5Cright%5D%20%5C%5C%0A&amp;=%20%20%5Coperatorname%7Bargmin%7D_%5Ctheta%5Cleft%5B%5Cfrac%7B1%7D%7BN%7D%20%5Csum_%7Bi=1%7D%5EN%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2+%5Cfrac%7B1%7D%7BN%20%5Clambda%5E2%7D%5C%7C%5Ctheta%5C%7C_2%5E2%5Cright%5D%20%5C%5C%0A&amp;=%20%20%5Coperatorname%7Bargmin%7D_%5Ctheta%20M%20S%20E%5Cleft(f_%5Ctheta(X),%20y%5Cright)+%5Ctilde%7B%5Clambda%7D%5C%7C%5Ctheta%5C%7C_2%5E2%0A%5Cend%7Baligned%7D%0A"> The final term is nothing more than the standard MSE-objective with regularization. Hence, a regularized MSE objective is equivalent to a Bayesian MAP objective, given specific prior and likelihood.</p>
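<p>Since the derivation above is exact, the equivalence can also be checked numerically. The following is an illustrative Python sketch (not from the original derivation; the data and the value of lambda are made up) that evaluates both objectives on a grid for a one-parameter linear model:</p>
<pre><code>import numpy as np

# Toy data from y = 2x + noise; theta is a single slope parameter.
rng = np.random.default_rng(0)
x = rng.normal(size=20)
y = 2.0 * x + 0.5 * rng.normal(size=20)

lam = 3.0                              # prior scale lambda
n = len(x)
thetas = np.linspace(-5.0, 5.0, 2001)  # grid of candidate slopes

# Bayesian MAP objective: log-likelihood plus log-prior, with the
# normalization constants Z_1, Z_2 dropped as in the derivation.
log_post = np.array(
    [-np.sum((t * x - y) ** 2) - (t / lam) ** 2 for t in thetas]
)

# Regularized MSE objective with lambda_tilde = 1 / (N * lambda^2).
reg_mse = np.array(
    [np.mean((t * x - y) ** 2) + t ** 2 / (n * lam ** 2) for t in thetas]
)

# Both objectives single out the same parameter value on the grid.
assert thetas[np.argmax(log_post)] == thetas[np.argmin(reg_mse)]
print(thetas[np.argmax(log_post)])</code></pre>
<p>Because the regularized MSE is just the negative MAP objective scaled by 1/N, the argmax of one is the argmin of the other by construction.</p>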
<p>If you reverse the above, you can find a prior-likelihood pair for almost any frequentist loss function. However, the frequentist objective typically only yields a point solution.</p>
<p>The Bayesian method, on the other hand, gives you a full posterior distribution with uncertainty intervals on top. Using <a href="https://en.wikipedia.org/wiki/Prior_probability?ref=sarem-seitz.com#Uninformative_priors">uninformative prior distributions</a> you can also remove the regularization term if necessary. We won’t cover a derivation here, however.</p>
</section>
</section>
<section id="the-price-of-bayesian-machine-learning-in-the-real-world" class="level2">
<h2 class="anchored" data-anchor-id="the-price-of-bayesian-machine-learning-in-the-real-world">The price of Bayesian Machine Learning in the real world</h2>
<p>So, in theory, Bayesian Machine Learning yields a more complete picture than the frequentist approach. The caveat, however, is the intractability of the posterior distribution and thus the need to approximate or estimate it.</p>
<p>Both approximation and estimation, however, inevitably lead to a loss in precision. Take for example the popular <a href="https://www.sarem-seitz.com/when-is-bayesian-machine-learning-actually-useful/#:~:text=So%2C%20in%20theory,don%27t%20understand%20everything%3A">variational inference approach for Bayesian Neural Networks</a>. In order to optimize the model, we need to approximate a so-called ELBO-objective by sampling from the variational distribution.</p>
<p>The usual ELBO-objective looks like this - don’t worry if you don’t understand everything: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Bequation%7D%0AELBO=%5Cmathbb%7BE%7D_%7Bq(%5Ctheta)%7D%5B%5Clog%20p(D%20%5Cmid%20%5Ctheta)%5D-KL(q(%5Ctheta)%20%5C%7C%20p(%5Ctheta))%20%5Clabel%7B7%7D%5Ctag%7B7%7D%0A%5Cend%7Bequation%7D%0A"> Our goal is to either maximize (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B7%7D">) or minimize its negative. However, the expectation term is typically intractable. The easiest solution to this issue is the application of the <a href="https://gregorygundersen.com/blog/2018/04/29/reparameterization/?ref=sarem-seitz.com">reparameterization trick</a>. This allows us to estimate the ELBO via <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7BELBO%7D=%5Cfrac%7B1%7D%7BM%7D%20%5Csum_%7Bm=1%7D%5EM%20%5Clog%20p%5Cleft(D%20%5Cmid%20g_%5Ctheta%5Cleft(%5Cepsilon_m%5Cright)%5Cright)-KL(q(%5Ctheta)%20%5C%7C%20p(%5Ctheta))%20%5Clabel%7B8%7D%5Ctag%7B8%7D%0A"> The linearity of the gradient operation then allows us to estimate the gradient of (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B7%7D">) via: <img src="https://latex.codecogs.com/png.latex?%0A%5Cnabla_%5Ctheta%20%5Chat%7BELBO%7D=%5Cfrac%7B1%7D%7BM%7D%20%5Csum_%7Bm=1%7D%5EM%20%5Cnabla_%5Ctheta%20%5Clog%20p%5Cleft(D%20%5Cmid%20g_%5Ctheta%5Cleft(%5Cepsilon_m%5Cright)%5Cright)-%5Cnabla_%5Ctheta%20KL(q(%5Ctheta)%20%5C%7C%20p(%5Ctheta))%0A"> With this formula, we are basically sampling M gradients from a reparametrized distribution. Thereafter, we use those samples to calculate an ‘average’ gradient.</p>
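<p>To make the estimator concrete, here is a minimal Python sketch (mine, not from the original text) for the simplest possible model: a single Gaussian mean parameter with a standard Normal prior, so that the KL term is available in closed form and only the expectation term needs Monte Carlo sampling. The data and variational parameters are made up:</p>
<pre><code>import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(loc=1.5, size=50)   # hypothetical observed data

# Variational posterior q(theta) = N(mu, sigma^2); prior p(theta) = N(0, 1);
# likelihood y_i ~ N(theta, 1).
mu, sigma = 1.0, 0.5

def log_lik(theta):
    # Gaussian log-likelihood of all data given the mean parameter.
    return -0.5 * np.sum((y - theta) ** 2) - 0.5 * len(y) * np.log(2 * np.pi)

def kl_q_p(mu, sigma):
    # Closed-form KL divergence KL(N(mu, sigma^2) || N(0, 1)).
    return 0.5 * (sigma ** 2 + mu ** 2 - 1.0) - np.log(sigma)

def elbo_hat(M):
    # Reparameterization trick: theta = g(eps) = mu + sigma * eps with
    # eps ~ N(0, 1), so the Monte Carlo average stays differentiable
    # in the variational parameters (mu, sigma).
    eps = rng.normal(size=M)
    return np.mean([log_lik(mu + sigma * e) for e in eps]) - kl_q_p(mu, sigma)

print(elbo_hat(100))</code></pre>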
<section id="unbiased-gradient-estimation-for-the-elbo" class="level3">
<h3 class="anchored" data-anchor-id="unbiased-gradient-estimation-for-the-elbo">Unbiased gradient estimation for the ELBO</h3>
<p>As demonstrated in the landmark paper by <a href="https://arxiv.org/abs/1312.6114?ref=sarem-seitz.com">Kingma and Welling (2013)</a>, we have <img src="https://latex.codecogs.com/png.latex?%0A%5Cmathbb%7BE%7D%5Cleft%5B%5Cnabla_%5Ctheta%20%5Chat%7BELBO%7D%5Cright%5D=%5Cnabla_%5Ctheta%20ELBO%20%5Clabel%7B9%7D%5Ctag%7B9%7D%0A"> In essence, equation (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B9%7D">) tells us that our sampling-based gradient is, on average, equal to the true gradient. The problem with (<img src="https://latex.codecogs.com/png.latex?%5Cref%7B9%7D">) is that we don’t know anything about the higher-order statistical moments of the sampled gradient.</p>
<p>For example, our estimate might have a prohibitively large variance. Thus, while we gradient descend in the correct direction on average, we are doing so unreasonably inefficiently.</p>
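<p>The variance issue is easy to demonstrate empirically. The short Python sketch below (illustrative values; for simplicity it measures the spread of the ELBO’s expectation term rather than of its gradient) compares the estimator’s spread for one sample per step versus one hundred:</p>
<pre><code>import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(loc=1.5, size=50)     # hypothetical dataset
mu, sigma = 1.0, 0.5                 # variational parameters of q(theta)

def mc_term(M):
    # Monte Carlo estimate of E_q[log p(D | theta)] with M samples,
    # drawn via the reparameterization theta = mu + sigma * eps.
    theta = mu + sigma * rng.normal(size=M)
    return np.mean([-0.5 * np.sum((y - t) ** 2) for t in theta])

# Empirical spread of the estimator over 500 repetitions.
spread_small = np.std([mc_term(1) for _ in range(500)])
spread_large = np.std([mc_term(100) for _ in range(500)])

# Averaging over more samples per step shrinks the estimator's variance.
assert spread_small > spread_large
print(spread_small, spread_large)</code></pre>
<p>In practice, larger M buys lower gradient variance at the cost of more computation per optimization step.</p>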
<p>The additional randomness due to resampling also makes it difficult to find the right stopping time for gradient descent. Add that to the problem of non-convex loss-functions and your chances of underperforming against a non-Bayesian network are fairly high.</p>
<p>In summary, the true posterior in a Bayesian world could give us a more complete picture about the optimal parameters. Since we need to resort to approximation however, we most likely end up with worse performance than with the standard approach.</p>
</section>
</section>
<section id="when-does-bayesian-machine-learning-actually-make-sense" class="level2">
<h2 class="anchored" data-anchor-id="when-does-bayesian-machine-learning-actually-make-sense">When does Bayesian Machine Learning actually make sense?</h2>
<p>The above considerations raise the question of whether you would even want to use Bayesian Machine Learning at all. Personally, I see two particular situations where this could be the case. Keep in mind that what follows are rules of thumb. The actual decision for or against Bayesian Machine Learning should be based on the specific problem at hand.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-is-bayesian-machine-learning-actually-useful/bayesianML_decisiontree.png" class="img-fluid figure-img" alt="When is Bayesian Machine Learning actually useful? A simplistic decision tree for guidance - always adapt to your specific problem."></p>
<figcaption>When is Bayesian Machine Learning actually useful? A simplistic decision tree for guidance - always adapt to your specific problem.</figcaption>
</figure>
</div>
<section id="small-datasets-and-informative-prior-knowledge" class="level3">
<h3 class="anchored" data-anchor-id="small-datasets-and-informative-prior-knowledge">Small datasets and informative prior knowledge</h3>
<p>Let us move back to the regularized MSE derivation from before. Our objective was <img src="https://latex.codecogs.com/png.latex?%0A%5Coperatorname%7Bargmax%7D_%5Ctheta%5Cleft%5B%5Csum_%7Bi=1%7D%5EN%20%5Clog%20%5Cfrac%7B1%7D%7BZ_1%7D%20e%5E%7B-%5Cleft(f_%5Ctheta%5Cleft(x_i%5Cright)-y_i%5Cright)%5E2%7D+%5Csum_%7Bk=1%7D%5EK%20%5Clog%20%5Cfrac%7B1%7D%7BZ_2%7D%20e%5E%7B-%5Cleft(%5Cfrac%7B%5Ctheta_k%7D%7B%5Clambda%7D%5Cright)%5E2%7D%5Cright%5D%0A"> There is a clear tradeoff between the size of the dataset, <img src="https://latex.codecogs.com/png.latex?N">, and the number of model parameters, <img src="https://latex.codecogs.com/png.latex?K">. For sufficiently large datasets, the prior on the right side becomes irrelevant for the MAP solution and vice-versa. This is also a key property for posterior inference in general.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-is-bayesian-machine-learning-actually-useful/posteriorprop.png" class="img-fluid figure-img" alt="Figurative behaviour of the posterior distribution. More data makes the posterior a closer fit to the data. Less data makes it more and more similar to the prior distribution."></p>
<figcaption>Figurative behaviour of the posterior distribution. More data makes the posterior a closer fit to the data. Less data makes it more and more similar to the prior distribution.</figcaption>
</figure>
</div>
<p>When the dataset is small in relation to the number of model parameters, the posterior distribution will closely resemble the prior. Thus, if the prior distribution contains sensible information about the problem, we can mitigate the lack of data to some extent.</p>
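<p>This data-versus-prior tradeoff can be made fully explicit in the one model where the posterior is available in closed form: a Normal prior on a Normal mean. The Python sketch below (all values made up for illustration) shows how the prior’s weight in the posterior mean shrinks as the sample size grows:</p>
<pre><code>import numpy as np

# Conjugate Normal-mean model: prior theta ~ N(mu0, tau^2), data
# y_i ~ N(theta, sigma^2), with known tau and sigma.
mu0, tau, sigma, true_theta = 0.0, 1.0, 1.0, 3.0
rng = np.random.default_rng(3)

def prior_weight(n):
    # Share of the posterior mean contributed by the prior mean;
    # it shrinks towards zero as the sample size n grows.
    return (1 / tau ** 2) / (1 / tau ** 2 + n / sigma ** 2)

def posterior_mean(n):
    y = rng.normal(true_theta, sigma, size=n)
    w = prior_weight(n)
    return w * mu0 + (1 - w) * np.mean(y)

# Two observations: the prior still carries a third of the weight.
assert prior_weight(2) > 0.3
# Ten thousand observations: the prior is essentially ignored.
assert prior_weight(2) > 1000 * prior_weight(10_000)
print(posterior_mean(2), posterior_mean(10_000))</code></pre>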
<p>Take, for example, a plain linear regression model. Depending on the signal-to-noise ratio, a lot of data might be necessary before the parameters converge to the ‘best’ value. If you have a well-reasoned prior distribution about that ‘best’ solution, your model might actually converge faster.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-is-bayesian-machine-learning-actually-useful/LinReg.png" class="img-fluid figure-img"></p>
<figcaption>1) Due to the small dataset and the random noise, the plain OLS regression line is off</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-is-bayesian-machine-learning-actually-useful/02LinRegPlusPrior.png" class="img-fluid figure-img"></p>
<figcaption>2) A well informed prior already produces a reasonable range before any model fit has occurred</figcaption>
</figure>
</div>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-is-bayesian-machine-learning-actually-useful/03LinRegPlusPost.png" class="img-fluid figure-img"></p>
<figcaption>3) After ‘training’, the Bayesian model produces a predictive mean regression line that comes much closer to the data generating regression line</figcaption>
</figure>
</div>
<p>The reverse applies as well, though: a poorly chosen prior pulls the posterior away from the data. As a result, convergence to the ‘best’ solution could then be even slower than without any prior at all. There are more tools in Bayesian statistics to mitigate this reasonable objection, and I am keen to cover this topic in a future article.</p>
<p>Also, even if you don’t want to go fully Bayesian, a MAP estimate could still be useful given high-quality prior knowledge.</p>
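<p>The linear regression example from the figures has a closed-form counterpart. The following Python sketch (with made-up data, mirroring the figures only in spirit) contrasts the OLS slope estimate with the conjugate posterior mean under a strongly informed slope prior:</p>
<pre><code>import numpy as np

rng = np.random.default_rng(4)

# Five noisy points from y = 1.0 * x + noise; the (hypothetical) prior
# N(1, 0.1^2) encodes strong knowledge that the slope is close to 1.
n, true_slope, noise = 5, 1.0, 1.0
x = rng.normal(size=n)
y = true_slope * x + noise * rng.normal(size=n)

# Frequentist OLS slope estimate (no intercept for simplicity).
ols = np.sum(x * y) / np.sum(x * x)

# Conjugate posterior mean of the slope with prior N(m0, s0^2) and
# known noise variance: a precision-weighted blend of prior and data.
m0, s0 = 1.0, 0.1
prec = 1 / s0 ** 2 + np.sum(x * x) / noise ** 2
post_mean = (m0 / s0 ** 2 + np.sum(x * y) / noise ** 2) / prec

# The posterior mean always lies between the prior mean and the OLS fit.
assert max(m0, ols) + 1e-9 >= post_mean >= min(m0, ols) - 1e-9
print(ols, post_mean)</code></pre>
<p>On tiny samples, the OLS estimate is at the mercy of the noise, while the posterior mean is anchored by the prior; whether that anchor helps depends entirely on how good the prior is.</p>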
</section>
<section id="functional-priors-for-modern-bayesian-machine-learning" class="level3">
<h3 class="anchored" data-anchor-id="functional-priors-for-modern-bayesian-machine-learning">Functional priors for modern Bayesian Machine Learning</h3>
<p>A different issue might be the difficulty of expressing meaningful prior knowledge over neural network weights. If the model is a black-box, how could you actually tell it what to do? Apart from maybe some vague zero-mean Normal distribution over weights?</p>
<p>One promising direction to solve this issue could be <a href="https://arxiv.org/pdf/1903.05779.pdf?ref=sarem-seitz.com">functional Bayes</a>. In that approach, we only tell the network what outputs we expect for given inputs, based on our prior knowledge. Hence, we only care about the posterior functions that our model can express. The exact parameter posterior distribution is only of secondary interest.</p>
<p>In short, if you have only limited data but well-informed prior knowledge, Bayesian Machine Learning could actually help.</p>
</section>
<section id="importance-of-uncertainty-quantification" class="level3">
<h3 class="anchored" data-anchor-id="importance-of-uncertainty-quantification">Importance of uncertainty quantification</h3>
<p>For large datasets, the effect of your prior knowledge becomes less and less relevant. In that case, obtaining a frequentist solution might be fully sufficient and even the superior approach. On some occasions, however, the full picture - a.k.a. posterior uncertainty - could be important.</p>
<p>Consider the case of making a medical decision based on some Machine Learning model. Suppose, in addition, that the model does not produce a final decision but rather supports doctors in their judgement. If Bayesian point accuracy is at least close to that of a frequentist equivalent, the uncertainty output can serve as useful extra information.</p>
<p>Presuming that the expert’s time is limited, they might want to take a closer look at highly uncertain model output. For output with low uncertainty, a quick sanity check might suffice. This of course requires the uncertainty estimates to be valid in the first place.</p>
<p>Hence, the Bayesian approach comes at the additional cost of monitoring uncertainty performance as well. This should of course be taken into account when deciding whether to use a Bayesian model or not.</p>
<p>Time-series forecasting problems are another example where Bayesian uncertainty can shine. If you model a time series in an autoregressive manner, i.e.&nbsp;you estimate <img src="https://latex.codecogs.com/png.latex?%0Ap%5Cleft(y_%7Bt+1%7D%20%5Cmid%20y_t,%20%5Cldots,%20y_%7Bt-j%7D%5Cright)%0A">, then your model errors accumulate the further ahead you try to forecast.</p>
<p>Even for a highly simple autoregressive time-series such as <img src="https://latex.codecogs.com/png.latex?%0Ay_%7Bt+1%7D=0.99%20%5Ccdot%20y_t+%5Cepsilon,%20%5Cquad%20%5Cepsilon%20%5Csim%20%5Cmathcal%7BN%7D(0,1)%0A"> a minuscule deviation in your estimated model <img src="https://latex.codecogs.com/png.latex?%0A%5Chat%7B%5Cmathbb%7BE%7D%7D%5Cleft%5By_%7Bt+1%7D%20%5Cmid%20y_t%5Cright%5D=1.02%20%5Ccdot%20y_t%0A"> can lead to catastrophic error accumulation in the long run: <img src="https://latex.codecogs.com/png.latex?%0A%5Cbegin%7Baligned%7D%0A&amp;%20%5Cleft(%5Cmathbb%7BE%7D%5Cleft%5By_%7Bt+T%7D%20%5Cmid%20y_t%5Cright%5D-%5Chat%7B%5Cmathbb%7BE%7D%7D%5Cleft%5By_%7Bt+T%7D%20%5Cmid%20y_t%5Cright%5D%5Cright)%5E2%20%5C%5C%0A=%20&amp;%20%5Cleft(0.99%5ET%20%5Ccdot%20y_t-1.02%5ET%20%5Ccdot%20y_t%5Cright)%5E2%20%5C%5C%0A%5Crightarrow%20&amp;%201.02%5E%7B2%20T%7D%20y%5E2%0A%5Cend%7Baligned%7D%0A"> A full posterior could give a much clearer picture of how other, similarly probable scenarios might play out. Even though the model might be wrong on average, we nevertheless get an idea of how different parameter estimates affect the forecast.</p>
<div class="quarto-figure quarto-figure-center">
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/images/when-is-bayesian-machine-learning-actually-useful/04BayesForecast.png" class="img-fluid figure-img"></p>
<figcaption>Bayesian forecast (purple) with uncertainty vs.&nbsp;frequentist point forecast (red). Although the frequentist forecast is slightly more accurate in the long run, the Bayesian uncertainty interval correctly includes the realized mean trajectory.</figcaption>
</figure>
</div>
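<p>The error accumulation above is easy to reproduce numerically. The following sketch simply evaluates the mean forecasts under the true coefficient 0.99 and the slightly misestimated 1.02 (the starting value is an arbitrary assumption):</p>

```python
import numpy as np

y_t = 1.0                       # current observation (illustrative)
true_coef, est_coef = 0.99, 1.02

# T-step-ahead mean forecasts under the true and the estimated coefficient
horizons = np.arange(1, 201)
true_mean = true_coef ** horizons * y_t
est_mean = est_coef ** horizons * y_t

# The squared error of the mean forecast explodes with the horizon,
# although 0.99 and 1.02 look almost identical
sq_err = (true_mean - est_mean) ** 2
print(sq_err[0], sq_err[-1])
```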
<p>Finally, if the last paragraph has made you interested in trying out Bayesian Machine Learning, here is a great method to start with:</p>
</section>
</section>
<section id="bayesian-deep-learning-light-with-mc-dropout" class="level2">
<h2 class="anchored" data-anchor-id="bayesian-deep-learning-light-with-mc-dropout">Bayesian Deep Learning light with MC dropout</h2>
<p>Luckily, it is nowadays not too costly to train a Bayesian model. Unless you are working with very big models, the increased computational demand of Bayesian inference should not be too problematic.</p>
<p>One particularly straightforward approach to Bayesian inference is MC dropout. The latter was introduced by <a href="http://proceedings.mlr.press/v48/gal16.pdf?ref=sarem-seitz.com">Gal and Ghahramani</a> and has since become a fairly popular tool. In summary, the authors show that it is sensible to use Dropout during both training and inference of a model. In fact, this is proved to be equivalent to variational inference with a particular, Bernoulli-based, variational distribution.</p>
<p>Hence, MC Dropout can be a great starting point to make your Neural Network Bayes-ish without requiring too much effort upfront.</p>
<section id="pros-and-cons" class="level3">
<h3 class="anchored" data-anchor-id="pros-and-cons">Pros and cons</h3>
<p>On the one hand, MC dropout makes it quite straightforward to make an existing model Bayesian. You can simply re-train it with dropout and leave dropout turned on during inference. The samples you create that way can be seen as draws from the Bayesian posterior predictive distribution.</p>
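<p>The sampling mechanism can be sketched in a few lines. For illustration, the network below uses random placeholder weights and plain NumPy instead of a trained PyTorch or TensorFlow model; the essential part is that the Bernoulli dropout mask is re-drawn on every forward pass at inference time:</p>

```python
import numpy as np

rng = np.random.default_rng(42)

# Placeholder weights for a tiny 1-16-1 network; in practice these
# would come from a model trained with dropout
W1, b1 = rng.normal(size=(1, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 1)), np.zeros(1)

def forward(x, p_drop=0.5):
    """One stochastic forward pass with dropout kept ON."""
    h = np.tanh(x @ W1 + b1)
    mask = rng.random(h.shape) > p_drop   # fresh Bernoulli mask per pass
    h = h * mask / (1.0 - p_drop)         # inverted-dropout scaling
    return h @ W2 + b2

x = np.array([[0.3]])
# Repeated stochastic passes yield draws from the (approximate)
# posterior predictive distribution
samples = np.array([forward(x)[0, 0] for _ in range(500)])
print(samples.mean(), samples.std())
```

<p>The spread of the samples then serves as the model’s uncertainty estimate at that input.</p>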
<p>On the other hand, the variational distribution in MC dropout is based on Bernoulli random variables. This should - in theory - make it an even less accurate approximation than the common Gaussian variational distribution. However, building and using such a model is quite simple, requiring only plain TensorFlow or PyTorch.</p>
<p>Also, there has been some deeper criticism about the validity of the approach. There exists an interesting debate involving one of the authors <a href="https://twitter.com/ianosband/status/1014466510885216256?ref=sarem-seitz.com">here</a>.</p>
<p>Whatever you make of that criticism, MC Dropout can be a helpful baseline. Once you get the hang of Bayesian Machine Learning, you can try to improve performance with more sophisticated models from there.</p>
</section>
</section>
<section id="conclusion" class="level2">
<h2 class="anchored" data-anchor-id="conclusion">Conclusion</h2>
<p>I hope this article has given you some insight into the usefulness of Bayesian Machine Learning. It is certainly no magic bullet, and there are many occasions where it might not be the right choice. If you carefully weigh the pros and cons, though, Bayesian methods can be a highly useful tool.</p>
<p>Also, I am happy to have a deeper discussion on the topic, either in the comments or via private channels.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Gelman, Andrew, et al.&nbsp;“Bayesian data analysis”. CRC press, 2013.</p>
<p><strong>[2]</strong> Kruschke, John. “Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan.” Academic Press, 2014.</p>
<p><strong>[3]</strong> Sivia, Devinderjit, and John Skilling. “Data analysis: a Bayesian tutorial.” OUP Oxford, 2006.</p>


</section>

 ]]></description>
  <category>Bayesian</category>
  <guid>https://www.sarem-seitz.com/posts/when-is-bayesian-machine-learning-actually-useful.html</guid>
  <pubDate>Sat, 22 Jan 2022 00:00:00 GMT</pubDate>
</item>
<item>
  <title>A Gaussian Process model for heteroscedasticity</title>
  <dc:creator>Sarem </dc:creator>
  <link>https://www.sarem-seitz.com/posts/a-gaussian-process-model-for-heteroscedasticity.html</link>
  <description><![CDATA[ 





<section id="introduction" class="level2">
<h2 class="anchored" data-anchor-id="introduction">Introduction</h2>
<p>A common phenomenon when working on continuous regression problems is the non-constant residual variance, also known as heteroscedasticity. While <a href="https://en.wikipedia.org/wiki/Heteroscedasticity?ref=sarem-seitz.com">heteroscedasticity</a> is often seen in Statistics and Econometrics, it doesn’t seem to receive as much attention in mainstream Machine Learning and Data Science literature.</p>
<p>Although predicting the mean via MSE-minimisation is often sufficient and more pragmatic, a proper treatment of the variance can be helpful at times. See for example this past blog post of mine for more thoughts on this topic.</p>
<p>In this article I want to show an example of how we can use Gaussian Processes to model heteroscedastic data. Since explaining every theoretical aspect would go far beyond the scope of this post, I recommend reading the references if you are interested in such models. First, let us start with a brief problem definition.</p>
</section>
<section id="problem-definition" class="level2">
<h2 class="anchored" data-anchor-id="problem-definition">Problem definition</h2>
<p>At the heart of non-constant variance models lies the assumption of some functional relation between the input data and the variance of the target variable. Presuming also a Gaussian target variable, we can construct the following probabilistic setup:</p>
<p><img src="https://latex.codecogs.com/png.latex?y%5Csim%5Cmathcal%7BN%7D%5Cleft(m(X),%5Csigma%5E2(X)%5Cright)"></p>
<p>Put plainly, given some input data, the corresponding target should be Gaussian with mean and variance being arbitrary functions of our inputs. Since our focus today is on the variance, let us simplify things a little with a zero-mean function, i.e.</p>
<p><img src="https://latex.codecogs.com/png.latex?y%5Csim%5Cmathcal%7BN%7D%5Cleft(0,%5Csigma%5E2(X)%5Cright)"></p>
<p>Our task is now to find a suitable function for sigma squared.</p>
<p>If, ex-ante, we don’t know much about our target function, whatever model we come up with should account for our uncertainty about the actual functional form of sigma squared. This is also known as <a href="http://www.ce.memphis.edu/7137/PDFs/Abrahamson/C05.pdf?ref=sarem-seitz.com">epistemic uncertainty</a> and one of the main considerations in <a href="https://algorithmia.com/blog/bayesian-machine-learning?ref=sarem-seitz.com">Bayesian Machine Learning</a>. In simple terms, we now don’t expect that a single model would best describe our target function anymore.</p>
<p>Instead, a — possibly infinitely large — set of models is considered and our goal is to place a probability distribution (a.k.a. posterior distribution) on this set such that those models that best describe the data (a.k.a. likelihood) given the assumptions we made (a.k.a. prior distributions) are the most likely ones.</p>
<p>This is usually done in weight space by defining our set of models in an implicit manner via the sets of parameters that describe the models’ behaviour — probably the most famous example in Machine Learning are <a href="https://medium.com/neuralspace/bayesian-neural-network-series-post-1-need-for-bayesian-networks-e209e66b70b2?ref=sarem-seitz.com">Bayesian Neural Networks</a>. Another, more abstract approach is to directly work in function space, i.e.&nbsp;we now explicitly look for the most likely functions without requiring parameters to describe them in the first place.</p>
<p>Since we are working in the Bayesian domain, this also means that prior and posterior distributions aren’t put over parameters anymore but also directly over functions. One of the most iconic frameworks for such modelling is <a href="https://distill.pub/2019/visual-exploration-gaussian-processes/?ref=sarem-seitz.com">Gaussian Process (GP) regression</a>.</p>
<p>If this concept is new to you and sounds confusing, I recommend not worrying about the underlying assumptions for now and just looking at the formulas. One of the most popular books on Gaussian Process models (by number of citations), Gaussian Processes for Machine Learning (GPML), provides a very clear introduction to the theoretical setup and is completely open source. To keep this article from bloating, I will not go into too much detail and instead suggest you study the topics you don’t understand on your own.</p>
</section>
<section id="the-model" class="level2">
<h2 class="anchored" data-anchor-id="the-model">The model</h2>
<p>Our goal will be to model the varying variance of the target variable through a GP, which looks as follows:</p>
<p><img src="https://latex.codecogs.com/png.latex?y%5Csim%20%5Cmathcal%7BN%7D(0,f%5E%7Bexp%7D(X))"></p>
<p><img src="https://latex.codecogs.com/png.latex?f(%5Ccdot)%5Csim%20%5Cmathcal%7BGP%7D(0,k(%5Ccdot,%5Ccdot)+%5Cnu%5Ccdot%5Cdelta_%7Bij%7D),%5C,f%5E%7Bexp%7D(X)=exp(f(X))"></p>
<p>This implies that the logarithm of our variance function is a GP — we need to squash the raw GP through an exponential to ensure that the variance will always be greater than zero. Any other function that maps the real line to the positive reals would do here, but the exponential is arguably the most popular choice.</p>
<p>The above also implies that the GP is a latent component of our model, one that we only observe indirectly through the data we collect. Finally, we added extra noise to the GP kernel via the delta summand, which makes the model more stable in practice.</p>
<p>We can then derive the posterior distribution via Bayes’ theorem as follows:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(f%7CX,y)=%5Cfrac%7B%5Cmathcal%7BN%7D(y%7C0,f%5E%7Bexp%7D(X))%5Ccdot%5Cmathcal%7BGP%7D(f%7C0,k(%5Ccdot,%5Ccdot)+%5Cnu%5E2%5Ccdot%5Cdelta_%7Bij%7D))%7D%7Bp(y%7CX)%7D"></p>
<p>While it is possible to derive the left-hand side in closed form for some basic GP models, we cannot do so in our case. Instead we will apply <a href="https://cedar.buffalo.edu/~srihari/CSE574/Chap4/4.4-Laplace.pdf?ref=sarem-seitz.com">Laplace Approximation</a> and approximate it through a synthetic multivariate Normal</p>
<p><img src="https://latex.codecogs.com/png.latex?p(f%7CX,y)%5Capprox%20%5Cmathcal%7BN%7D(f%7C%5Chat%7Bf%7D,A)"></p>
<p>The exact steps for Laplace Approximation are explained in <a href="http://www.gaussianprocess.org/gpml/chapters/RW3.pdf?ref=sarem-seitz.com">Chapter 3 of the GPML book</a> for a binary classification model and we only need to adjust the approach to our model.</p>
<p>In summary, the mean parameter of our approximation should match the mode of the posterior, while its covariance matrix is the negative inverse of the Hessian matrix of our data log-likelihood function. We have:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Chat%7Bf%7D=argmax_f%20%5Clog%5Cmathcal%7BN%7D(y%7C0,f%5E%7Bexp%7D(X))+%5Clog%5Cmathcal%7BGP%7D(f%7C0,k(%5Ccdot,%5Ccdot)+%5Cnu%5E2%5Ccdot%5Cdelta_%7Bij%7D))"></p>
<p><img src="https://latex.codecogs.com/png.latex?A%5E%7B-1%7D=-%5Cnabla%5E2%20%5Clog%20%5Cmathcal%7BN%7D(y%7C0,f%5E%7Bexp%7D(X))"></p>
<p>The first equation follows from the fact that the denominator of the posterior formula does not depend on the latent function we maximise over, together with the monotonicity of the logarithm. The second equation follows from a second-order Taylor expansion around the maximum of the posterior function.</p>
<p>To find the approximate mean and optimal kernel hyper-parameters for some example data later on, we will plug the whole loss into an automatic differentiation package and let the computer do the rest. For the covariance matrix of our approximation, we need to actually calculate the Hessian matrix.</p>
<p>A common simplification for GP models is the assumption of independent observations of the target variable given a realisation of the latent GP:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(y%7Cf)=%5Cprod_%7Bi=1%7D%5EN%20p(y_i%7Cf)"></p>
<p>This allows us to simplify the Hessian matrix to be zero everywhere, except for its diagonal which is the second-order derivative of the log-likelihood function with respect to the GP:</p>
<p><img src="https://latex.codecogs.com/png.latex?H_%7Bii%7D=%5Cfrac%7B%5Cpartial%5E2%7D%7B%5Cpartial%20f_i%5E2%7D%5Clog%20p(y_i%7Cf_i)"></p>
<p>The right-hand side can be derived by differentiating the standard Gaussian log-likelihood twice with respect to the variance while accounting for our exponential transform:</p>
<p><img src="https://latex.codecogs.com/png.latex?%5Cbegin%7Bgathered%7D%0A%5Cfrac%7B%5Cpartial%5E2%7D%7B%5Cpartial%20f_i%5E2%7D%5Clog%20p(y_i%7Cf_i)%5C%5C%0A=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20f_i%7D%20%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20f_i%7D%5Clog%20p(y_i%7Cf_i)%5C%5C%0A=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20f_i%7D%20%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20f_i%7D%5Cleft(%20-0.5%20%5Clog%20(2%5Cpi)-0.5f_i%20-%200.5%5Cfrac%7By_i%5E2%7D%7Bexp(f_i)%7D%5Cright)%5C%5C%0A=%5Cfrac%7B%5Cpartial%7D%7B%5Cpartial%20f_i%7D%20%5Cleft(-0.5%20+%200.5%5Cfrac%7By_i%5E2%7D%7Bexp(f_i)%7D%5Cright)%5C%5C%0A=-%200.5%5Cfrac%7By_i%5E2%7D%7Bexp(f_i)%7D%0A%5Cend%7Bgathered%7D%0A"></p>
<p>Which yields</p>
<p><img src="https://latex.codecogs.com/png.latex?H_%7Bii%7D=-%200.5%5Cfrac%7By_i%5E2%7D%7Bexp(f_i)%7D"></p>
<p>and</p>
<p><img src="https://latex.codecogs.com/png.latex?A_%7Bii%7D=2%5Ccdot%5Cfrac%7Bexp(f_i)%7D%7By_i%5E2%7D"></p>
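<p>The second derivative and the resulting diagonal of A can be sanity-checked numerically. A small sketch, comparing the closed form against a central finite difference (the values of y and f are arbitrary):</p>

```python
import numpy as np

def loglik(f, y):
    # Gaussian log-likelihood with variance exp(f), as in the model above
    return -0.5 * np.log(2 * np.pi) - 0.5 * f - 0.5 * y**2 / np.exp(f)

y, f, eps = 1.3, 0.4, 1e-4

# Central finite difference for the second derivative w.r.t. f
h_numeric = (loglik(f + eps, y) - 2 * loglik(f, y) + loglik(f - eps, y)) / eps**2

h_analytic = -0.5 * y**2 / np.exp(f)   # the derived H_ii
a_analytic = 2.0 * np.exp(f) / y**2    # A_ii = -1 / H_ii

print(h_numeric, h_analytic, a_analytic)
```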
<p>Finally, we need to derive the so-called posterior predictive distribution i.e.&nbsp;our predictions for new, unobserved inputs:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(y%5E*%7CX%5E*,X,Y)"></p>
<p>I will only state the results from <a href="http://www.gaussianprocess.org/gpml/chapters/RW3.pdf?ref=sarem-seitz.com">GPML, chapter 3</a> for our setup, without the preceding derivations. First, we need to calculate the posterior predictive distribution for the latent GP which, using our approximation from above, is yet another GP:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(f%5E*%7CX%5E*,X,Y)=%5Cmathcal%7BGP%7D(f%5E*%7C%5Cmu%5E*,S%5E*)"></p>
<p><img src="https://latex.codecogs.com/png.latex?%5Cmu%5E*=K_%7B*N%7DK_%7BNN%7D%5E%7B-1%7D%5Chat%7Bf%7D"></p>
<p><img src="https://latex.codecogs.com/png.latex?S%5E*=K_%7B**%7D-K_%7B*N%7D(K_%7BNN%7D+A)%5E%7B-1%7DK_%7B*N%7D%5ET"></p>
<p>where the K-matrices denote the kernel Gram matrices of the training and evaluation datasets and the kernel cross-covariance matrix between them. If you are familiar with GP regression, you can see that the posterior mean and covariance terms are almost the same as in the standard case, except that we account for the mean and covariance of our Laplace approximation.</p>
<p>Finally, we can derive the posterior predictive distribution for new data by marginalising out the GP posterior predictive function:</p>
<p><img src="https://latex.codecogs.com/png.latex?p(y%5E*%7CX%5E*,X,Y)=%5Cint%20p(y%5E*%7Cf%5E*)p(f%5E*%7C%5Cmu%5E*,S%5E*)df%5E*%20=%20%5Cint%5Cmathcal%7BN%7D(y%5E*%7C0,f%5E*)%5Cmathcal%7BGP%7D(f%5E*%7C%5Cmu%5E*,S%5E*)df%5E*"></p>
<p>This integral is also intractable. Luckily, since we only want to evaluate the posterior predictive distribution, we can instead draw samples from it via Monte Carlo.</p>
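<p>As a sketch of that Monte Carlo step, assume we already have the Laplace-approximate posterior mean and covariance at a few test points (the numbers below are hypothetical placeholders, not fitted values):</p>

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical GP posterior at 3 test points: f* ~ N(mu_star, S_star)
mu_star = np.array([-0.5, 0.1, 0.8])
S_star = np.diag([0.04, 0.09, 0.04])

n_samples = 5000
# Step 1: draw latent log-variance functions from the GP posterior
f_draws = rng.multivariate_normal(mu_star, S_star, size=n_samples)
# Step 2: draw observations y* ~ N(0, exp(f*)) for each latent draw
y_draws = rng.normal(0.0, np.sqrt(np.exp(f_draws)))

# Monte Carlo estimate of the posterior predictive standard deviation
print(y_draws.std(axis=0))
```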
<p>To demonstrate this approach in practice, I implemented a brief example in Julia.</p>
</section>
<section id="a-quick-example-using-julia" class="level2">
<h2 class="anchored" data-anchor-id="a-quick-example-using-julia">A quick example using Julia</h2>
<p>The data is a simple, 1D toy-example with 200 observations and generating distributions</p>
<p><img src="https://latex.codecogs.com/png.latex?y%5Csim%20%5Cmathcal%7BN%7D(0,sin%5E2(2%20*X)),%5Cquad%20X%5Csim%5Cmathcal%7BU%7D(-3,3)"></p>
<p>i.e.&nbsp;the input variable is sampled uniformly between -3 and 3 and the target is drawn from a zero-mean Gaussian with periodic variance:</p>
<div id="cell-7" class="cell" data-execution_count="1">
<div class="sourceCode cell-code" id="cb1" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb1-1"><span class="im" style="color: #00769E;
background-color: null;
font-style: inherit;">using</span> <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Flux</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Zygote</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Distributions</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">DistributionsAD</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">FastGaussQuadrature</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Plots</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">StatsPlots</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">KernelFunctions</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">LinearAlgebra</span>, <span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span></span></code></pre></div>
</div>
<div id="cell-8" class="cell" data-execution_count="2">
<div class="sourceCode cell-code" id="cb2" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb2-1"><span class="bu" style="color: null;
background-color: null;
font-style: inherit;">Random</span>.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">seed!</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">987</span>)</span>
<span id="cb2-2"></span>
<span id="cb2-3">sample_size <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">200</span></span>
<span id="cb2-4"></span>
<span id="cb2-5">X <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,sample_size)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">6</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span></span>
<span id="cb2-6">Xline <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Matrix</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">transpose</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">collect</span>(<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3.9</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3.9</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>])) <span class="co" style="color: #5E5E5E;
background-color: null;
#small,">
font-style: inherit;">#small, interpolated 1D grid for evaluation</span></span>
<span id="cb2-7">Xline_train <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Xline[(Xline<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.&gt;=-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.&amp;</span> (Xline<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.&lt;=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)]</span>
<span id="cb2-8">y <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,sample_size)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sin</span>.(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span>X)</span>
<span id="cb2-9"></span>
<span id="cb2-10"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(Xline_train,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(Xline_train));ribbon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sin</span>.(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span>Xline_train)))</span>
<span id="cb2-11"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">scatter!</span>(X[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none, fmt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>png,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>),color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="2">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/a-gaussian-process-model-for-heteroscedasticity_files/figure-html/cell-3-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>To fully define the GP, we also need to specify the kernel function — here I chose a standard Square-Exponential (SE) kernel plus the already mentioned additive noise term, i.e.</p>
<p><img src="https://latex.codecogs.com/png.latex?k(x,x')%20=%20s%5Ccdot%5Cexp%5Cleft(%5Cfrac%7B(x-x')%5E2%7D%7Bl%7D%5Cright)%20+%20%5Cnu%5Cdelta_%7Bxx'%7D"></p>
<p>where all three hyper-parameters need to be positive. We now have all the formulas we need to define the necessary functions and structs (Julia’s counterpart to classes in object oriented languages)</p>
<div id="cell-10" class="cell" data-execution_count="3">
<div class="sourceCode cell-code" id="cb3" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb3-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">mutable struct</span> SEKernel <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;:</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;"> KernelFunctions.Kernel</span></span>
<span id="cb3-2">    l_log</span>
<span id="cb3-3">    s_log</span>
<span id="cb3-4"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb3-5"></span>
<span id="cb3-6">Flux.<span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@functor</span> SEKernel</span>
<span id="cb3-7"></span>
<span id="cb3-8"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">SEKernel</span>() <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">SEKernel</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb3-9"></span>
<span id="cb3-10"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#using KernelFunctions.jl for the primitives, we specify the raw SE-kernel here - the noise term will be added in the full model </span></span>
<span id="cb3-11">KernelFunctions.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kernelmatrix</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">SEKernel</span>,x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Matrix</span>,y<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Matrix</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>.(m.s_log)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>.(</span>
<span id="cb3-12">                                                                    <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-sum</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>.(m.l_log)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span>(Flux.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unsqueeze</span>(x,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3</span>)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span>Flux.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">unsqueeze</span>(y,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>,<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb3-13">                                                                )</span>
<span id="cb3-14"></span>
<span id="cb3-15">KernelFunctions.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kernelmatrix</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">SEKernel</span>,x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">Matrix</span>) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> KernelFunctions.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kernelmatrix</span>(m,x,x);</span></code></pre></div>
</div>
<div id="cell-11" class="cell" data-execution_count="4">
<div class="sourceCode cell-code" id="cb4" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb4-1"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">mutable struct</span> HeteroscedasticityGPModel</span>
<span id="cb4-2">    </span>
<span id="cb4-3">    kern</span>
<span id="cb4-4">    </span>
<span id="cb4-5">    f_hat</span>
<span id="cb4-6">    </span>
<span id="cb4-7">    nu_log</span>
<span id="cb4-8">    </span>
<span id="cb4-9"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb4-10">Flux.<span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@functor</span> HeteroscedasticityGPModel</span>
<span id="cb4-11"></span>
<span id="cb4-12"></span>
<span id="cb4-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># 'n' specifies the dimensionality or our Laplace Approximation - needs to be equal to the size of the training dataset</span></span>
<span id="cb4-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">HeteroscedasticityGPModel</span>(n) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">HeteroscedasticityGPModel</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">SEKernel</span>(),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,n),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>))</span>
<span id="cb4-15"></span>
<span id="cb4-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># this is the log-likelihood term we need for calculating the optimal mean parameter of the Laplace Approximation - Zygote.jl and Flux.jl will be used for Autodiff </span></span>
<span id="cb4-17"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">function</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">loss</span>(m<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">::</span><span class="dt" style="color: #AD0000;
background-color: null;
font-style: inherit;">HeteroscedasticityGPModel</span>,y,X)</span>
<span id="cb4-18">    _,N <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">size</span>(m.f_hat)</span>
<span id="cb4-19">    K<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kernelmatrix</span>(m.kern,X)</span>
<span id="cb4-20">        </span>
<span id="cb4-21">    likelihood_term <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sum</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormal</span>.(y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(N),m.f_hat[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]))</span>
<span id="cb4-22">    prior_term <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Distributions.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">logpdf</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MvNormal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(N),K<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(N)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>.(m.nu_log)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])),m.f_hat[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>])</span>
<span id="cb4-23">    </span>
<span id="cb4-24">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">return</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>likelihood_term <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> prior_term <span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#negative since Flux.Optimise minimizes the target but we want to maximize it</span></span>
<span id="cb4-25"><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>;</span></code></pre></div>
</div>
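<p>For clarity, the <code>loss</code> defined above is simply the negative unnormalised log posterior over the latent vector <code>f_hat</code>; minimising it yields the MAP estimate that then serves as the mean of the Laplace Approximation. In the code's notation (a plain-notation sketch, not runnable code):</p>
<pre><code>loss(f_hat) = - Σᵢ log N(yᵢ | 0, exp(f_hatᵢ))          (likelihood_term)
              - log N(f_hat | 0, K + exp(nu_log)·I)     (prior_term)</code></pre>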
<div id="cell-12" class="cell" data-execution_count="5">
<div class="sourceCode cell-code" id="cb5" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb5-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#Manual derivation of log-likelihood and hessian diagonal of the log-likelihood with respect to f; the gradient is also included </span></span>
<span id="cb5-2"></span>
<span id="cb5-3"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormal</span>(x,m,s) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">log</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>. <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3.14</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(s)) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">./</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(s))<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>m)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>.</span>
<span id="cb5-4"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormgrad</span>(x,m,s) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(s)<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>m)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb5-5"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormhess</span>(x,m,s) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.5</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(s)<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*</span>(x<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>m)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">^</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span></span>
<span id="cb5-6"></span>
<span id="cb5-7"></span>
<span id="cb5-8"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#finite differences test to ensure that the formulas for the gradient and hessian diagonal are correct</span></span>
<span id="cb5-9">eps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-10</span></span>
<span id="cb5-10"><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@assert</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">abs</span>((<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormal</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-llnormal</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>eps))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>eps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormgrad</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span></span>
<span id="cb5-11"><span class="pp" style="color: #AD0000;
background-color: null;
font-style: inherit;">@assert</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">abs</span>((<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormgrad</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-llnormgrad</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span>eps))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">/</span>eps <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormhess</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>)) <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">&lt;</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>;</span></code></pre></div>
</div>
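<p>For reference, the three functions checked above follow from differentiating the Gaussian log-density with respect to the log-variance <code>s</code>; a short derivation sketch in plain notation (note that the code approximates π as 3.14):</p>
<pre><code>log N(x | m, exp(s))         = -0.5·log(2π·exp(s)) - (x - m)² / (2·exp(s))
(d/ds)  log N(x | m, exp(s)) = -0.5 + (x - m)² / (2·exp(s))
(d²/ds²) log N(x | m, exp(s)) =      - (x - m)² / (2·exp(s))</code></pre>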
<div id="cell-13" class="cell" data-execution_count="6">
<div class="sourceCode cell-code" id="cb6" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb6-1">model <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">HeteroscedasticityGPModel</span>(sample_size)</span>
<span id="cb6-2">params <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Flux.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">params</span>(model);</span></code></pre></div>
</div>
<div id="cell-14" class="cell" data-execution_count="7">
<div class="sourceCode cell-code" id="cb7" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb7-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#test that the Zygote.jl gradients are actually working</span></span>
<span id="cb7-2">Zygote.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gradient</span>(()<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-&gt;loss</span>(model,y,X),params);</span></code></pre></div>
</div>
<div id="cell-15" class="cell" data-execution_count="8">
<div class="sourceCode cell-code" id="cb8" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb8-1">opt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ADAM</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0.01</span>)</span>
<span id="cb8-2"></span>
<span id="cb8-3"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">for</span> i <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2500</span></span>
<span id="cb8-4">    </span>
<span id="cb8-5">    grads <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> Zygote.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">gradient</span>(()<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-&gt;loss</span>(model,y,X),params)</span>
<span id="cb8-6">    Flux.Optimise.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">update!</span>(opt,params,grads)</span>
<span id="cb8-7">    </span>
<span id="cb8-8">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">if</span> i<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">%</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">250</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">==</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span></span>
<span id="cb8-9">        <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">println</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">loss</span>(model,y,X))</span>
<span id="cb8-10">    <span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span></span>
<span id="cb8-11"><span class="cf" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>;</span></code></pre></div>
<div class="cell-output cell-output-stdout">
<pre><code>248.54495027449855
11.759623403151409
-221.67423883166694
-451.1514391636018
-659.5369049784196
-795.0349981998211
-752.3599410064332
-937.6992763973955
-922.2646364618785
-981.90093613224</code></pre>
</div>
</div>
<div id="cell-16" class="cell" data-execution_count="9">
<div class="sourceCode cell-code" id="cb10" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb10-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#calculate the posterior distribution of f evaluated at X_star</span></span>
<span id="cb10-2"></span>
<span id="cb10-3">M,N <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">size</span>(model.f_hat)</span>
<span id="cb10-4"></span>
<span id="cb10-5">K<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">kernelmatrix</span>(model.kern,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(X,Xline))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>(model.nu_log[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">N+size</span>(Xline)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>]))</span>
<span id="cb10-6">A <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inv</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">-Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">llnormhess</span>.(y[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(N),model.f_hat[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]))<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(sample_size)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>))</span>
<span id="cb10-7"></span>
<span id="cb10-8"></span>
<span id="cb10-9"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#kernel covariance matrices as </span></span>
<span id="cb10-10"></span>
<span id="cb10-11">K_NN <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> K[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>sample_size,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>sample_size]</span>
<span id="cb10-12">K_star_star <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> K[sample_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>,sample_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>]</span>
<span id="cb10-13">K_star_N <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> K[sample_size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">+</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">end</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>sample_size]</span>
<span id="cb10-14"></span>
<span id="cb10-15"></span>
<span id="cb10-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#posterior mean and covariance of the GP evaluated at `Xline`</span></span>
<span id="cb10-17">mu_star <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> model.<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">f_hat*inv</span>(K_NN<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(sample_size)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>))<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*transpose</span>(K_star_N)</span>
<span id="cb10-18">S_star <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> K_star_star<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.-</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">K_star_N*inv</span>(K_NN<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">inv</span>(K_NN<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span>A <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(sample_size)<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>)))<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">*transpose</span>(K_star_N)</span>
<span id="cb10-19"></span>
<span id="cb10-20"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#posterior probability distribution of f, evaluated at X_star</span></span>
<span id="cb10-21">p_f_star <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">MvNormal</span>(mu_star[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Symmetric</span>(S_star<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.+</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">Diagonal</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">ones</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">size</span>(Xline)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1e-6</span>)));</span></code></pre></div>
</div>
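<p>For reference, the standard Laplace Approximation predictive equations for a GP that this step builds on (see Rasmussen &amp; Williams, chapter 3) read as follows, written in the variable names used above and ignoring the numerical jitter terms:</p>
<pre><code>mu_star = K_star_N · K_NN⁻¹ · f_hat'
S_star  = K_star_star - K_star_N · (K_NN + A)⁻¹ · K_star_N'    (A = W⁻¹, W = -diag. Hessian of the log-likelihood)</code></pre>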
<p>After optimising the kernel hyper-parameters and the Laplace Approximation as above, the resulting functional posterior predictive distribution looks as follows:</p>
<div id="cell-18" class="cell" data-execution_count="10">
<div class="sourceCode cell-code" id="cb11" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb11-1"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;"># Monte Carlo sampling from the posterior predictive distribution for y_star</span></span>
<span id="cb11-2">mc_samples <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> [<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">randn</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">size</span>(Xline)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sqrt</span>.((<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">exp</span>.(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">rand</span>(p_f_star)))) for _ <span class="kw" style="color: #003B4F;
background-color: null;
font-weight: bold;
font-style: inherit;">in</span> <span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span><span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">100000</span>]</span>
<span id="cb11-3">mc_sample_matrix <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">hcat</span>(mc_samples<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">...</span>)</span>
<span id="cb11-4"></span>
<span id="cb11-5"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#posterior predictive mean is zero by our model definition</span></span>
<span id="cb11-6">posterior_predictive_mean <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">size</span>(Xline)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>])[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb11-7"></span>
<span id="cb11-8">posterior_predictive_std <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">std</span>(mc_sample_matrix,dims<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span>)[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]</span>
<span id="cb11-9"></span>
<span id="cb11-10"></span>
<span id="cb11-11"></span>
<span id="cb11-12"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#plot(Xline[:],posterior_predictive_mean,legend=:none,ribbon=(posterior_predictive_mean.-ci_90percent_lower, ci_90percent_upper.-posterior_predictive_mean), fmt = :png,size=(1000,500),</span></span>
<span id="cb11-13"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#color=:green)</span></span>
<span id="cb11-14"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(Xline_train,<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">zeros</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">length</span>(Xline_train));ribbon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span> <span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">sin</span>.(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.*</span>Xline_train)))</span>
<span id="cb11-15"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot!</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],posterior_predictive_mean,legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,ribbon<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">2</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">*</span> posterior_predictive_std, fmt <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span> <span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>png,size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1000</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>), color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red)</span>
<span id="cb11-16"><span class="co" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">#scatter!(X[:],y[:],color=:red)</span></span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="10">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/a-gaussian-process-model-for-heteroscedasticity_files/figure-html/cell-11-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>To see what happens for data that lies outside the range of our training data, the evaluation was performed on a wider interval than the one we trained on. As you can see, the posterior predictive variance increases sharply the further we look into the unknown.</p>
<p>This is exactly what should happen under epistemic uncertainty. Close to the training data, a model can learn, to some extent, which functions describe the target function. The farther we move away from the training data, however, the larger the set of candidate functions that could describe the unobserved data equally well.</p>
<p>Put simply, the less similar our test data is to the training data, the more uncertain we should be about our predictions. This uncertainty is expressed by the variance of the posterior predictive distribution: the larger the variance, the larger the uncertainty.</p>
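<p>This behaviour of the predictive variance can be reproduced in a few lines of standalone Julia. The snippet below is a minimal sketch, independent of the cells above: it uses a squared-exponential kernel with unit prior variance and evaluates the standard GP posterior variance <code>k(x*,x*) - k(x*,X) K⁻¹ k(X,x*)</code> at a point inside and a point far outside the training inputs. The kernel, lengthscale and training grid are illustrative choices, not the ones used in this post.</p>
<div class="sourceCode"><pre class="sourceCode julia"><code class="sourceCode julia">using LinearAlgebra

# Squared-exponential kernel with unit prior variance (illustrative choice)
k(x, x′; ℓ=1.0) = exp(-(x - x′)^2 / (2ℓ^2))

X = collect(-2.0:0.5:2.0)                     # toy training inputs
K = [k(a, b) for a in X, b in X] + 1e-8 * I   # small jitter for numerical stability

# Posterior variance of the latent GP at a test input x_star
function post_var(x_star)
    ks = [k(x_star, b) for b in X]
    k(x_star, x_star) - dot(ks, K \ ks)
end

post_var(0.0)   # inside the data: variance collapses towards zero
post_var(6.0)   # far outside: variance returns towards the prior variance of one
</code></pre></div>
<p>Inside the observed range the posterior variance is essentially zero; far outside, it reverts to the prior variance, which is exactly the ribbon behaviour in the plot above.</p>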
<p>We can also see this quite well by comparing the posterior predictive density for an X that lies in the center of observed data vs.&nbsp;the posterior predictive density for an X rather outside that range:</p>
<div id="cell-20" class="cell" data-execution_count="11">
<div class="sourceCode cell-code" id="cb12" style="background: #f1f3f5;"><pre class="sourceCode julia code-with-copy"><code class="sourceCode julia"><span id="cb12-1"><span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">plot</span>(<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">density</span>(mc_sample_matrix[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">findall</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.==</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">0</span>.)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>),<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">density</span>(mc_sample_matrix[<span class="fu" style="color: #4758AB;
background-color: null;
font-style: inherit;">findall</span>(Xline[<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>]<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">.==-</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">3.5</span>)[<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1</span>],<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">:</span>],legend<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>none,color<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>red,lw<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span><span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">4</span>),</span>
<span id="cb12-2">size<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=</span>(<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">1400</span>,<span class="fl" style="color: #AD0000;
background-color: null;
font-style: inherit;">500</span>),fmt<span class="op" style="color: #5E5E5E;
background-color: null;
font-style: inherit;">=:</span>png)</span></code></pre></div>
<div class="cell-output cell-output-display" data-execution_count="11">
<div>
<figure class="figure">
<p><img src="https://www.sarem-seitz.com/posts/a-gaussian-process-model-for-heteroscedasticity_files/figure-html/cell-12-output-1.png" class="img-fluid figure-img"></p>
</figure>
</div>
</div>
</div>
<p>It is quite obvious that the posterior predictive density at -3.5 implies a much broader range of potential values for y than the posterior predictive density at zero. Being able to quantify such uncertainty is one of the most intriguing features of Bayesian machine learning, and I highly recommend diving deeper into this vast topic.</p>
</section>
<section id="going-further" class="level2">
<h2 class="anchored" data-anchor-id="going-further">Going further</h2>
<p>Obviously, the example we used was only a toy dataset and doesn’t prove anything about the real-world capabilities of the proposed model. If you are interested, feel free to use and modify the code and try the model on something more realistic.</p>
<p>One potential application for such models is financial time-series data, which is well known to exhibit strongly varying variance in periods of crisis. While GARCH models are often considered state-of-the-art here, a GP model might be an interesting alternative. Another possible improvement for general continuous regression problems would be to also model the data mean as a GP.</p>
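<p>To illustrate that last idea, here is a small generative sketch, not the model from this post: it places independent GP priors on both the mean and the log-variance and samples data from the resulting heteroscedastic likelihood. Kernel, lengthscale and input grid are arbitrary choices for illustration.</p>
<div class="sourceCode"><pre class="sourceCode julia"><code class="sourceCode julia">using LinearAlgebra, Random

k(x, x′; ℓ=1.0) = exp(-(x - x′)^2 / (2ℓ^2))

Random.seed!(42)
x = collect(range(-3, 3, length=100))
K = Symmetric([k(a, b) for a in x, b in x] + 1e-6 * I)
L = cholesky(K).L

f_mean   = L * randn(length(x))    # GP draw for the conditional mean
f_logvar = L * randn(length(x))    # GP draw for the log-variance

# Heteroscedastic sample: both mean and noise level vary smoothly with x
y = f_mean .+ exp.(f_logvar ./ 2) .* randn(length(x))
</code></pre></div>
<p>Inference in such a model would then require approximating two latent GPs instead of one, but the Laplace Approximation used above carries over in principle.</p>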
<p>A final word about scalability: Plain GP models like the one we discussed here are quite infamous for being infeasible for larger datasets. Luckily, many smart people have developed methods to solve these issues, at least to some extent. In case you are interested in such approaches, you can find an overview in <a href="http://gpss.cc/gpss19/slides/Dai2019.pdf?ref=sarem-seitz.com">these slides from the Gaussian Process Summer School 2019</a>.</p>
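<p>As a taste of what such scaling tricks look like, the sketch below uses the simplest one, a subset-of-data approximation: fit an exact GP regression on a random subset of m points, so the Cholesky factorisation costs O(m³) instead of O(N³). This is an illustrative toy with made-up data and hyper-parameters, not one of the sophisticated sparse methods covered in the slides.</p>
<div class="sourceCode"><pre class="sourceCode julia"><code class="sourceCode julia">using LinearAlgebra, Random

k(x, x′; ℓ=1.0) = exp(-(x - x′)^2 / (2ℓ^2))

Random.seed!(123)
N, m = 5_000, 200                        # full dataset size vs. subset size
X = sort(rand(N) .* 10)
y = sin.(X) .+ 0.1 .* randn(N)

idx = randperm(N)[1:m]                   # random subset of the training data
Xs, ys = X[idx], y[idx]

# Exact GP regression on the subset only: O(m^3) instead of O(N^3)
K = [k(a, b) for a in Xs, b in Xs] + 1e-2 * I   # noise term also regularises
α = cholesky(Symmetric(K)) \ ys

# Predictive mean at a single test input
x_star = 5.0
μ = dot([k(x_star, b) for b in Xs], α)
</code></pre></div>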
<p>And that’s it for today. Thanks for reading this far, and let me know in the comments if you have any questions or found any errors in this post.</p>
</section>
<section id="references" class="level2">
<h2 class="anchored" data-anchor-id="references">References</h2>
<p><strong>[1]</strong> Rasmussen, Carl Edward. Gaussian processes in machine learning. In: Summer School on Machine Learning. Springer, Berlin, Heidelberg, 2003.</p>
<p><strong>[2]</strong> Bollerslev, Tim. Generalized autoregressive conditional heteroskedasticity. Journal of Econometrics, 1986, 31.3, p.&nbsp;307-327.</p>


</section>

 ]]></description>
  <category>Bayesian</category>
  <category>Gaussian Processes</category>
  <guid>https://www.sarem-seitz.com/posts/a-gaussian-process-model-for-heteroscedasticity.html</guid>
  <pubDate>Mon, 28 Jun 2021 00:00:00 GMT</pubDate>
</item>
</channel>
</rss>
