# Phorgy Phynance

## More fun with maximum likelihood estimation


A while ago, I wrote a post

### Fun with maximum likelihood estimation

where I jotted down some notes. I ended the post with the following:

Note: The first time I worked through this exercise, I thought it was cute, but I would never compute $\mu$ and $\sigma^2$ as above so the maximum likelihood estimation, as presented, is not meaningful to me. Hence, this is just a warm up for what comes next. Stay tuned…

Well, it has been over a year and I’m trying to get a friend interested in MLE for a side project we might work on together, so I thought it would be good to revisit it now.

To briefly review, the probability of observing $N$ independent samples $X\in\mathbb{R}^N$ may be approximated by

\begin{aligned} P(X|\theta) = \prod_{i = 1}^N P(x_i|\theta) = \left(\Delta x\right)^N \prod_{i=1}^N \rho(x_i|\theta),\end{aligned}

where $\rho(x|\theta)$ is a probability density and $\theta$ represents the parameters we are trying to estimate. The key observation becomes clear after a slight change in perspective.

If we take the $N$th root of the above probability (and divide by $\Delta x$), we obtain the geometric mean of the individual densities, i.e.

\begin{aligned} \langle \rho(X|\theta)\rangle_{\text{geom}} = \prod_{i=1}^N \left[\rho(x_i|\theta)\right]^{1/N}.\end{aligned}

In computing the geometric mean above, each sample is given the same weight, i.e. $1/N$. However, we may have reason to weight some samples more heavily than others, e.g. if we are studying samples from a time series, we may want to weight the more recent data more heavily. This inspired me to replace $1/N$ with an arbitrary weight $w_i$ satisfying

\begin{aligned} w_i\ge 0,\quad\text{and}\quad \sum_{i=1}^N w_i = 1.\end{aligned}

With no apologies for abusing terminology, I’ll refer to this as the likelihood function

\begin{aligned} \mathcal{L}(\theta) = \prod_{i=1}^N \rho(x_i|\theta)^{w_i}.\end{aligned}

Replacing $w_i$ with $1/N$ would result in the same parameter estimation as the traditional maximum likelihood method.
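To spell out why, here is a short worked step (a sketch that only uses the definitions above). With $w_i = 1/N$,

\begin{aligned} \mathcal{L}(\theta) = \prod_{i=1}^N \rho(x_i|\theta)^{1/N} = \left[\prod_{i=1}^N \rho(x_i|\theta)\right]^{1/N},\end{aligned}

and since $t\mapsto t^{1/N}$ is strictly increasing for $t\ge 0$, the $\theta$ that maximizes $\mathcal{L}(\theta)$ is the same $\theta$ that maximizes the usual product of densities.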

It is often more convenient to work with log likelihoods, which have an even more intuitive expression

\begin{aligned}\log\mathcal{L}(\theta) = \sum_{i=1}^N w_i \log \rho(x_i|\theta),\end{aligned}

i.e. the log likelihood is simply the weighted (arithmetic) average of the log densities.
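As a rough illustration of the weighted log likelihood, here is a minimal Python sketch. It makes several assumptions that are not in the post: it uses a Gaussian density purely because the weighted maximizer has a closed form (the stable densities I actually use need a numerical optimizer), the exponential weighting scheme and the `halflife` parameter are just one hypothetical way to favor recent data, and the sample returns are made up.

```python
import math

def exp_weights(n, halflife):
    """Exponentially decaying weights that sum to 1; the last (most recent)
    sample gets the largest weight. `halflife` is a hypothetical parameter."""
    decay = 0.5 ** (1.0 / halflife)
    raw = [decay ** (n - 1 - i) for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw]

def weighted_log_likelihood(xs, ws, mu, sigma2):
    """sum_i w_i * log rho(x_i | mu, sigma2), with rho a Gaussian density."""
    return sum(
        w * (-0.5 * math.log(2.0 * math.pi * sigma2)
             - (x - mu) ** 2 / (2.0 * sigma2))
        for x, w in zip(xs, ws)
    )

def weighted_gaussian_mle(xs, ws):
    """Closed-form maximizer of the weighted Gaussian log likelihood:
    the weighted mean and the weighted variance about that mean."""
    mu = sum(w * x for x, w in zip(xs, ws))
    sigma2 = sum(w * (x - mu) ** 2 for x, w in zip(xs, ws))
    return mu, sigma2

# Made-up daily returns, oldest first (illustrative only).
returns = [0.012, -0.008, 0.003, -0.021, 0.015, 0.007, -0.002, 0.030]
ws = exp_weights(len(returns), halflife=3)
mu_hat, sigma2_hat = weighted_gaussian_mle(returns, ws)
print(mu_hat, sigma2_hat)
```

Passing uniform weights `[1/N] * N` to `weighted_gaussian_mle` recovers the ordinary sample mean and (population) variance, which is the equal-weight case described above.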

I use this approach to estimate stable density parameters for time series analysis that is more suitable for capturing risk in the tails. For instance, I used this technique when generating the charts in a post from back in 2009:

### 80 Years of Daily S&P 500 Value-at-Risk Estimates

which was subsequently picked up by Felix Salmon of Reuters in

### How has VaR changed over time?

and Tracy Alloway of Financial Times in

### On baseline VaR

If I find a spare moment, which is rare these days, I’d like to update that analysis and expand it to other markets. A lot has happened since August 2009. Other markets I’d like to look at would include other equity markets as well as fixed income. Due to the ability to cleanly model skew, stable distributions are particularly useful for analyzing fixed income returns.

Written by Eric

October 20, 2012 at 5:02 pm

## Leveraged ETFs: Selling vs Hedging

In this brief note, we’ll compare two similar leveraged ETF strategies. We begin by assuming a portfolio consists of an $x$-times leveraged bull ETF with daily return given by

$R_{\text{Long}} = x R_{\text{Index}} - R_{\text{Fee}},$

where $R_{\text{Fee}}$ is the fee charged by the manager, plus some cash equivalent with daily return $R_{\text{Cash}}$. With weights $w_{\text{Long}}$ and $w_{\text{Cash}}$ on the two holdings, the daily portfolio return is given by

\begin{aligned} R_{\text{Portfolio}} &= w_{\text{Long}} R_{\text{Long}} + w_{\text{Cash}} R_{\text{Cash}} \\ &= w_{\text{Long}} \left(x R_{\text{Index}} - R_{\text{Fee}}\right) + w_{\text{Cash}} R_{\text{Cash}}.\end{aligned}

We wish to reduce our exposure to the index.

### Strategy 1

An obvious thing to do to reduce exposure is to sell some shares of the leveraged ETF. In this case, the weight of the ETF is reduced by $\Delta w$ and the weight of cash increases by $\Delta w$. The daily portfolio return is then

$R_{\text{Strategy 1}} = R_{\text{Portfolio}} + \Delta w \left(-x R_{\text{Index}} + R_{\text{Fee}} + R_{\text{Cash}}\right).$

### Strategy 2

Another way to reduce exposure is to buy shares in the leveraged bear ETF. The daily return of the bear ETF is

$R_{\text{Short}} = -x R_{\text{Index}} - R_{\text{Fee}}.$

The daily return of this strategy is

$R_{\text{Strategy 2}} = R_{\text{Portfolio}} + \Delta w \left(-x R_{\text{Index}} - R_{\text{Fee}} - R_{\text{Cash}}\right).$

### Comparison

For most, I think it should be fairly obvious that Strategy 1 is preferred. However, I occasionally come across people with positions in both the bear and bull ETFs. The difference in the daily return of the two strategies is given by

$\Delta R = R_{\text{Strategy 1}} - R_{\text{Strategy 2}} = 2\,\Delta w\left(R_{\text{Fee}} + R_{\text{Cash}}\right).$

In other words, if you reduce exposure by buying the bear ETF, you’ll get hit both by fees and by the lost return on your cash equivalent.
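For anyone who wants to check the arithmetic, here is a small Python sketch of the two strategies. All the numbers (leverage, weights, fee, cash rate) are hypothetical values chosen for illustration, and the function names are mine. The gap between the two strategies scales with the shifted weight $\Delta w$ and does not depend on what the index does.

```python
def portfolio_return(w_long, w_cash, x, r_index, r_fee, r_cash):
    """Daily return of the bull ETF plus cash portfolio."""
    return w_long * (x * r_index - r_fee) + w_cash * r_cash

def strategy_1(w_long, w_cash, dw, x, r_index, r_fee, r_cash):
    """Sell dw of the bull ETF and hold the proceeds as cash."""
    return portfolio_return(w_long - dw, w_cash + dw, x, r_index, r_fee, r_cash)

def strategy_2(w_long, w_cash, dw, x, r_index, r_fee, r_cash):
    """Keep the bull ETF and spend dw of cash on the bear ETF."""
    r_short = -x * r_index - r_fee
    kept = portfolio_return(w_long, w_cash - dw, x, r_index, r_fee, r_cash)
    return kept + dw * r_short

# Hypothetical inputs: 3x leverage, 60/40 split, shift 10% of the portfolio.
params = dict(w_long=0.6, w_cash=0.4, dw=0.1, x=3.0,
              r_fee=0.0095 / 252, r_cash=0.02 / 252)

gap_up = strategy_1(r_index=0.01, **params) - strategy_2(r_index=0.01, **params)
gap_down = strategy_1(r_index=-0.02, **params) - strategy_2(r_index=-0.02, **params)
expected = 2 * params["dw"] * (params["r_fee"] + params["r_cash"])
print(gap_up, gap_down, expected)
```

The gap is tiny on any single day, but it has the same sign every day regardless of the index move, so it compounds against Strategy 2.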

Unless you’ve got some interesting derivatives strategy (which I’d love to hear about), I recommend not holding both the bear and bull ETFs simultaneously.

Note: I remain long BGU (which is now SPXL) at a cost of US$36 as a long-term investment – despite experts warning against holding these things. It closed yesterday at US$90.92.

Written by Eric

October 2, 2012 at 3:24 pm