Phorgy Phynance

Weighted Likelihood for Time-Varying Gaussian Parameter Estimation

leave a comment »

In a previous article, we presented a weighted likelihood technique for estimating parameters \theta of a probability density function \rho(x|\theta). The motivation being that for time series, we may wish to weigh more recent data more heavily. In this article, we will apply the technique to a simple Gaussian density

\rho(x|\mu,\nu) = \frac{1}{\sqrt{\pi\nu}} \exp\left[-\frac{(x-\mu)^2}{\nu}\right].

In this case, the log likelihood is given by

\begin{aligned} \log\mathcal{L}(\mu,\nu) &= \sum_{i=1}^N w_i \log\rho(x_i|\mu,\nu) \\ &= -\frac{1}{2} \log\pi\nu - \frac{1}{\nu} \sum_{i=1}^N w_i \left(x_i - \mu\right)^2 \end{aligned}.

Recall that the maximum likelihood occurs when

\begin{aligned} \frac{\partial}{\partial\mu} \log\mathcal{L}(\mu,\nu) = \frac{\partial}{\partial\nu} \log\mathcal{L}(\mu,\nu) = 0. \end{aligned}

A simple calculation demonstrates that this occurs when

\begin{aligned} \mu = \sum_{i=1}^N w_i x_i \end{aligned}


\begin{aligned} \sigma^2 = \sum_{i=1}^N w_i \left(x_i - \mu\right)^2, \end{aligned}

where \sigma^2 = \nu/2.

Introducing a weighted expectation operator for a random variable X with samples x_i given by

\begin{aligned} E_w(X) = \sum_{i=1}^N w_i x_i, \end{aligned}

the Gaussian parameters may be expressed in a familiar form via

\mu = E_w(X)


\sigma^2 = E_w(X^2) - \left[E_w(X)\right]^2.

This simple result justifies the use of weighted expectations for time varying Gaussian parameter estimation. As we will see, this is also useful for coding financial time series analysis.



Written by Eric

February 3, 2013 at 4:33 pm

60 GHz Wireless – A Reality Check

with 2 comments

The wireless revolution has been fascinating to watch. Radio (and micro) waves are transforming the way we live our lives. However, I’m increasingly seeing indications the hype may be getting ahead of itself and we’re beginning to have inflated expectations (c/o the hype cycle) about wireless broadband. In this post, I’d like to revisit some of my prior posts on the subject in light of something that has recently come to my attention: 60 GHz wireless.

Wavelength Matters

As I outlined in Physics of Wireless Broadband, the most important property that determines the propagation characteristics of radio (and micro) waves is its wavelength. Technical news and marketing materials about wireless broadband refer to frequency, but there is a simple translation to wavelength (in cm) given by:

\begin{aligned} \lambda(cm) = \frac{30}{f(GHz)}. \end{aligned}

Ages ago when I generated those SAR images, cell phones operated at 900 MHz (0.9 GHz) corresponding to a wavelength of about 33 cm. More recent 3G and 4G wireless devices operate at higher carrier frequencies up to 2.5 GHz corresponding to a shorter wavelength of 12 cm. Earlier this month, the FCC announced plans to release bandwidth at 5 GHz (6 cm).

This frequency creep is partially due to the issues related to the ultimate wireless bandwidth speed limit I outlined, but is also driven by a slight misconception that can be found on Wikipedia:

A key characteristic of bandwidth is that a band of a given width can carry the same amount of information, regardless of where that band is located in the frequency spectrum.

Although this is true from a pure information theoretic perspective, when it comes to wireless broadband, the transmission of information is not determined by Shannon alone. One must also consider Maxwell and there are far fewer people in the world that understand the latter than the former.

The propagation characteristics of 2G radio waves at 900 MHz (33 cm) are already quite different than 3G/4G microwaves at 2.5 GHz (12 cm) not to mention the newly announced 5 GHz (6 cm). That is why I was more than a little surprised to learn that organizations are seriously promoting 60 GHz WiFi. Plugging 60 GHz into our formula gives a wavelength of just 5 mm. This is important for three reasons: 1) Directionality and 2) Penetration, and 3) Diffraction.


As I mentioned in Physics of Wireless Broadband, in order for an antenna to broadcast fairly uniformly in all directions, the antenna length should not be much more than half the carrier wavelength. At 60 GHz, this means the antenna should not be much larger than 2.5 mm. This is not feasible due to the small amount of energy transmitted/received by such a tiny antenna.

Consequently, the antenna would end up being very directional, i.e. it will have preferred directions for transmission/reception, and you’ll need to aim your wireless device toward the router. With the possible exception of being in an empty anechoic chamber, the idea that you’ll be able to carry around a wireless device operating at 60 GHz and maintain a good connection is wishful thinking to say the least.


If directionality weren’t an issue, the transmission characteristics of 60 GHz microwaves alone should dampen any hopes for gigabit wireless at this frequency. Although the physics of transmission is complicated, as a general rule of thumb, the depth at which electromagnetic waves penetrate material is related to wavelength. Early 2G (33 cm) and more recent 3G/4G (12 cm) do a decent job of penetrating walls and doors, etc.

At 60 GHz (5 mm), the signal would be severely challenged to penetrate a paperback novel much less chairs, tables, or cubical walls. As a result, to receive full signal strength, 60 GHz wireless requires direct unobstructed line of sight between the device and router.

Photon Torpedoes vs. Molasses

The more interesting aspects of wireless signal propagation are diffraction and reflection, both of which can be understood via Huygen’s beautiful principle and both of which depend on wavelength. Wireless signals do a reasonably good job of oozing around obstacles if the wavelength is long compared to the size of the obstacle, i.e. at low frequencies. Wireless signal propagation is much better at lower frequencies because the signal can penetrate walls and doors and for those obstacles that cannot be penetrated, you still might receive a signal because the signal can ooze around corners.

As the frequency of the signal increases, the wave stops behaving like molasses oozing around and through obstacles, and begins acting more like photon torpedoes bouncing around the room like particles and shadowing begins to occur. At 60 GHz, shadowing would be severe and communication would depend on direct line of sight or indirect line of sight via reflections. However, it is important to keep in mind that each time the signal bounces off an obstacle, the strength is significantly weakened.

What Does it all Mean?

The idea that we can increase wireless broadband speeds simply by increasing the available bandwidth indefinitely is flawed because you must also consider the propagation characteristics of the carrier frequency. There is only a finite amount of spectrum available that has reasonable directionality, penetration, and diffraction characteristics. This unavoidable inherent physical limitation will lead us eventually to the ultimate wireless broadband speed limit. There is no amount of engineering that can defeat Heisenberg.

There are ways to obtain high bandwidth wireless signals, but you must sacrifice directionality. The extreme would be direct line of sight laser beam communications. Two routers can certainly communicate at gigabit speeds and beyond if they are connected by laser beams. Of course, there can be no obstacles between the routers or the signal will be lost. I can almost imagine a future-esque Star Wars-like communication system where individual mobile devices are, in fact, tracked with laser beams, but I don’t see that ever becoming a practical reality.

We still have some time before we reach this ultimate wireless broadband limit, but to not begin preparing for it now is irresponsible. The only future-proof technology is fiber optics. Communities should avoid the temptation to fore go fiber plans in favor of wireless because those who do so will soon bump into this wireless broadband limit and need to roll out fiber anyway.

Written by Eric

January 21, 2013 at 9:15 am

More fun with maximum likelihood estimation

with one comment

A while ago, I wrote a post

Fun with maximum likelihood estimation

where I jotted down some notes. I ended the post with the following:

Note: The first time I worked through this exercise, I thought it was cute, but I would never compute \mu and \sigma^2 as above so the maximum likelihood estimation, as presented, is not meaningful to me. Hence, this is just a warm up for what comes next. Stay tuned…

Well, it has been over a year and I’m trying to get a friend interested in MLE for a side project we might work on together, so thought it would be good to revisit it now.

To briefly review, the probability of observing N independent samples X\in\mathbb{R}^N may be approximated by

\begin{aligned} P(X|\theta) = \prod_{i = 1}^N P(x_i|\theta) = \left(\Delta x\right)^N \prod_{i=1}^N \rho(x_i|\theta),\end{aligned}

where \rho(x|\theta) is a probability density and \theta represents the parameters we are trying to estimate. The key observation becomes clear after a slight change in perspective.

If we take the Nth root of the above probability (and divide by \Delta x), we obtain the geometric mean of the individual densities, i.e.

\begin{aligned} \langle \rho(X|\theta)\rangle_{\text{geom}} = \prod_{i=1}^N \left[\rho(x_i|\theta)\right]^{1/N}.\end{aligned}

In computing the geometric mean above, each sample is given the same weighting, i.e. 1/N. However, we may have reason to want to weigh some samples heavier than others, e.g. if we are studying samples from a time series, we may want to weigh the more recent data heavier. This inspired me to replace 1/N with an arbitrary weight w_i satisfying

\begin{aligned} w_i\ge 0,\quad\text{and}\quad \sum_{i=1}^N w_i = 1.\end{aligned}

With no apologies for abusing terminology, I’ll refer to this as the likelihood function

\begin{aligned} \mathcal{L}(\theta) = \prod_{i=1}^N \rho(x_i|\theta)^{w_i}.\end{aligned}

Replacing w_i with 1/N would result in the same parameter estimation as the traditional maximum likelihood method.

It is often more convenient to work with log likelihoods, which has an even more intuitive expression

\begin{aligned}\log\mathcal{L}(\theta) = \sum_{i=1}^N w_i \log \rho(x_i|\theta),\end{aligned}

i.e. the log likelihood is simply the weighted (arithmetic) average of the log densities.

I use this approach to estimate stable density parameters for time series analysis that is more suitable for capturing risk in the tails. For instance, I used this technique when generating the charts in a post from back in 2009:

80 Years of Daily S&P 500 Value-at-Risk Estimates

which was subsequently picked up by Felix Salmon of Reuters in

How has VaR changed over time?

and Tracy Alloway of Financial Times in

On baseline VaR

If I find a spare moment, which is rare these days, I’d like to update that analysis and expand it to other markets. A lot has happened since August 2009. Other markets I’d like to look at would include other equity markets as well as fixed income. Due to the ability to cleanly model skew, stable distributions are particularly useful for analyzing fixed income returns.

Leveraged ETFs: Selling vs Hedging

leave a comment »

In this brief note, we’ll compare two similar leveraged ETF strategies. We begin by assuming a portfolio consists of an x-times leveraged bull ETF with daily return given by

R_{\text{Long}} = x R_{\text{Index}} - R_{\text{Fee}},

where R_{\text{Fee}} is the fee charged by the manager and some cash equivalent with daily return R_{\text{Cash}}. The daily portfolio return is given by

\begin{aligned} R_{\text{Portfolio}} &= w_{\text{Long}} R_{\text{Long}} + w_{\text{Cash}} R_{\text{Cash}} \\ &= w_{\text{Long}} \left(x R_{\text{Index}} - R_{\text{Fee}}\right) + w_{\text{Cash}} R_{\text{Cash}}.\end{aligned}

We wish to reduce our exposure to the index.

Strategy 1

An obvious thing to do to reduce exposure is to sell some shares of the leveraged ETF. In this case, the weight of the ETF is reduced by \Delta w and the weight of cash increases by \Delta w. The daily portfolio return is then

R_{\text{Strategy 1}} = R_{\text{Portfolio}} + \Delta w \left(-x R_{\text{Index}} + R_{\text{Fee}} + R_{\text{Cash}}\right).

Strategy 2

Another way to reduce exposure is to buy shares in the leveraged bear ETF. The daily return of the bear ETF is

R_{\text{Short}} = -x R_{\text{Index}} - R_{\text{Fee}}.

The daily return of this strategy is

R_{\text{Strategy 2}} = R_{\text{Portfolio}} + \Delta w \left(-x R_{\text{Index}} - R_{\text{Fee}} - R_{\text{Cash}}\right).


For most, I think it should be fairly obvious that Strategy 1 is preferred. However, I occasionally come across people with positions in both the bear and bull ETFs. The difference in the daily return of the two strategies is given by

\Delta R = 2\left(R_{\text{Fee}} + R_{\text{Cash}}\right).

In other words, if you reduce exposure by buying the bull ETF, you’ll get hit both by fees as well as lost return on your cash equivalent.

Unless you’ve got some interesting derivatives strategy (I’d love to hear about), I recommend not holding both the bear and bull ETFs simultaneously.

Note: I remain long BGU (which is now SPXL) at a cost of US$36 as a long-term investment – despite experts warning against holding these things. It closed yesterday at US$90.92.

Written by Eric

October 2, 2012 at 3:24 pm

Discrete Stochastic Calculus

leave a comment »

This post is part of a series

In the previous post of this series, we found that when Cartesian coordinates are placed on a binary tree, the commutative relations are given by

  • [dx,x] = \frac{(\Delta x)^2}{\Delta t} dt
  • [dt,t] = \Delta t dt
  • [dx,t] = [dt,x] = \Delta t dx.

There are two distinct classes of discrete calculus depending on the relation between \Delta x and \Delta t.

Discrete Exterior Calculus

If we set \Delta x = \Delta t, the commutative relations reduce to

  • [dx,x] = \Delta t dt
  • [dt,t] = \Delta t dt
  • [dx,t] = [dt,x] = \Delta t dx

and in the continuum limit, i.e.  \Delta t\to 0, reduce to

  • [dx,x] = 0
  • [dt,t] = 0
  • [dx,t] = [dt,x] = 0.

In other words, when \Delta x = \Delta t, the commutative relations vanish in the continuum limit and the discrete calculus converges to the exterior calculus of differential forms.

Because of this, the discrete calculus on a binary tree with \Delta x = \Delta t will be referred to as the discrete exterior calculus.

Discrete Stochastic Calculus

If instead of \Delta x = \Delta t, we set (\Delta x)^2 = \Delta t, the commutative relations reduce to

  • [dx,x] = dt
  • [dt,t] = \Delta t dt
  • [dx,t] = [dt,x] = \Delta t dx

and in the continuum limit, i.e.  \Delta t\to 0, reduce to

  • [dx,x] = dt
  • [dt,t] = 0
  • [dx,t] = [dt,x] = 0.

In this case, all commutative relations vanish in the continuum limit except [dx,x] = dt.

In the paper:

I demonstrate how the continuum limit of the commutative relations give rise to (a noncommutative version of) stochastic calculus, where dx plays the role of a Brownian motion.

Because of this, the discrete calculus on a binary tree with (\Delta x)^2 = \Delta t will be referred to as the discrete stochastic calculus.

To date, discrete stochastic calculus has found robust applications in mathematical finance and fluid dynamics. For instance, the application of discrete stochastic calculus to Black-Scholes option pricing was presented in

and the application to fluid dynamics was presented in

Both of these subjects will be addressed in more detail as part of this series of articles.

It should be noted that discrete calculus and its special cases of discrete exterior calculus and discrete stochastic calculus represent a new framework for numerical modeling. We are not taking continuum models built on continuum calculus and constructing finite approximations. Instead, we are building a robust mathematical framework that has finiteness built in from the outset. The resulting numerical models are not approximations, but exact models developed within a finite numerical framework. The framework itself converges to the continuum versions so that any numerical models built within this framework will automatically converge to the continuum versions (if such a thing is desired).

Discrete calculus provides a kind of meta algorithm. It is an algorithm for generating algorithms.

Written by Eric

August 25, 2012 at 12:38 pm

A Note on Discrete Helmholtz Decomposition

with one comment

The following is a note I sent to my PhD advisor, Professor Weng Cho Chew, on September 13, 2011 after a discussion over dinner as he was headed back to UIUC from a 4-year stint as the Dean of Engineering at the University of Hong Kong.

Decomposing Finite Dimensional Inner Product Spaces

Given finite-dimensional inner product spaces U, V  and a linear map A:U\to V, the adjoint map A^\dagger: V\to U is the unique linear map satisfying the property

\langle Au,v\rangle = \langle u,A^\dagger v\rangle

for all u\in U and v\in V.

In this section, we show that V can be decomposed into two orthogonal subspaces

V = \text{im} A\oplus \text{ker} A^\dagger

This is a fairly simple exercise as any finite-dimensional inner product space can be decomposed into a subspace and its orthogonal complement, i.e.

V = \text{im} A \oplus (\text{im} A)^\perp.

The only thing to show is that (\text{im} A)^\perp = \text{ker} A^\dagger.

To do this, note whenever v\in (\text{im} A)^\perp, then

\langle A u, v\rangle = \langle u,A^\dagger v\rangle = 0

for all u\in U. Thus, v is also in \text{ker} A^\dagger, i.e. (\text{im} A)^\perp \subset \text{ker} A^\dagger. Similarly, whenever v\in\text{ker} A^\dagger, then

\langle u,A^\dagger v\rangle = \langle A u, v\rangle = 0

for all u\in U. Thus, v is also in  (\text{im} A)^\perp, i.e. \text{ker} A^\dagger \subset (\text{im} A)^\perp. Since both (\text{im} A)^\perp \subset \text{ker} A^\dagger and \text{ker} A^\dagger \subset (\text{im} A)^\perp, it follows that (\text{im} A)^\perp = \text{ker} A^\dagger.

Hodge-Helmholtz Decomposition

Given finite-dimensional inner product spaces U, V ,W  and linear maps A:U\to V, B:V\to W such that B\circ A = 0, we wish to show that the inner product space V may be decomposed into three orthogonal subspaces

V = \text{im} A\oplus\text{im} B^\dagger\oplus\text{ker} \Delta,

where \Delta = A\circ A^\dagger + B^\dagger\circ B.

To show this, note that if v\in\text{ker}\Delta, then

\langle \Delta v,v\rangle = \langle A^\dagger v,A^\dagger v\rangle + \langle B v, B v\rangle = 0,

but this implies v\in \text{ker} A^\dagger and v\in\text{ker} B. Conversely, if v\in \text{ker} A^\dagger and v\in\text{ker} B, then v  is trivially in \text{ker}\Delta. In other words,

\text{ker} \Delta = \text{ker} A^\dagger \cap \text{ker} B.

Finally, since B\circ A = 0, we also have A^\dagger\circ B^\dagger = 0. Consequently, when v\in\text{im} B^\dagger, then v\in\text{ker} A^\dagger so

\text{im} B^\dagger \subset \text{ker} A^\dagger.

Applying the decomposition from the previous section twice, we conclude that

V = \text{im} A\oplus \text{ker} A^\dagger

and since \text{im} B^\dagger\subset \text{ker} A^\dagger, it follows that

\text{ker} A^\dagger = \text{im} B^\dagger\oplus \text{ker}A^\dagger\cap\text{ker} B

which may be expressed simply as

\text{ker} A^\dagger = \text{im} B^\dagger \oplus \text{ker} \Delta.

Putting this together we see the desired Hodge-Helmholtz decomposition

V = \text{im} A\oplus\text{im} B^\dagger\oplus\text{ker}\Delta.

Computational Electromagnetics

The preceding discussion is quite general and holds for any finite-dimensional inner product spaces U, V, W and any linear maps A:U\to V, B:V\to W satisfying B\circ A = 0. In this section, we specialize to computational electromagnetics.

Consider a discretization of a surface S consisting of N_0 vertices, N_1 directed edges, and N_2 oriented triangular faces. If we associate a degree of freedom to each vertex, the span of these degrees of freedom form an N_0-dimensional vector space V_0. Associating a degree of freedom to each directed edge forms an N_1-dimensional vector space V_1 and associating a degree of freedom to each oriented face forms an N_2-dimensional vector space V_2. For concreteness, vectors in V_0 will be expanded via

\begin{aligned}\phi = \sum_{i=1}^{N_0} \phi_i \mathbf{v}_i \in V_0,\end{aligned}

where \phi_i denotes the degree of freedom on the ith vertex, vectors in V_1 will be expanded via

\begin{aligned}\alpha = \sum_{i=1}^{N_1} \alpha_i \mathbf{e}_i\in V_1,\end{aligned}

where \alpha_i denotes the degree of freedom on the ith directed edge, and vectors in V_2 will be expanded via

\begin{aligned}\beta = \sum_{i=1}^{N_2} \beta_i \mathbf{f}_i\in V_2,\end{aligned}

where \beta_i denotes the degree of freedom on the ith oriented face.

To turn V_0, V_1, and V_2 into inner product spaces, we need to define three respective inner products. This can be done by defining three sets of basis functions B_0, B_1, and B_2. B_0 and B_2 take values defined at vertices and faces, respectively, and maps these to functions defined over each face. Similarly, B_1 takes values defined along each edge and maps these to vector fields defined over each face.

The basis functions linearly turn vectors in V_0, V_1, and V_2 into functions and vector fields defined over the surface  via

\begin{aligned}\phi = \sum_{i=1}^{N_0} \phi_i \mathbf{v}_i\quad\implies\quad B_0(\phi) = \sum_{i=1}^{N_0} \phi_i B_0(\mathbf{v}_i),\end{aligned}

\begin{aligned}\alpha = \sum_{i=1}^{N_1} \alpha_i \mathbf{e_i} \quad\implies\quad B_1(\alpha) = \sum_{i=1}^{N_1} \alpha_i B_1(\mathbf{e}_i),\end{aligned}


\begin{aligned}\beta = \sum_{i=1}^{N_2} \beta_i \mathbf{f}_i \quad\implies\quad B_2(\beta) = \sum_{i=1}^{N_2} \beta_i B_2(\mathbf{f}_i).\end{aligned}

The inner products may then be defined in terms of basis functions via

\begin{aligned}(M_0)_{i,j} = \langle\mathbf{v}_i,\mathbf{v}_j\rangle_0 = \int_S B_0(\mathbf{v}_i) B_0(\mathbf{v}_j) dA,\end{aligned}

\begin{aligned}(M_1)_{i,j} = \langle\mathbf{e}_i,\mathbf{e}_j\rangle_1 = \int_S B_1(\mathbf{e}_i)\cdot B_1(\mathbf{e}_j) dA,\end{aligned}


\begin{aligned}(M_2)_{i,j} = \langle \mathbf{f}_i,\mathbf{f}_j\rangle_2 = \int_S B_2(\mathbf{f}_i) B_2(\mathbf{f}_j) dA.\end{aligned}

Letting [\phi], [\alpha], and [\beta] denote column matrix representations of vectors in V_0, V_1, and ,V_2 the inner products may be expressed in terms of matrix-vector products via

\langle \phi,\phi'\rangle_0 = [\phi]^t [M_0] [\phi'],

\langle \alpha,\alpha'\rangle_1 = [\alpha]^t [M_1] [\alpha'],


\langle \beta,\beta'\rangle_2 = [\beta]^t [M_2] [\beta'].

The matrix-vector representation is helpful for explicitly expressing the adjoint of a linear map A:V_0\to V_1 via

\langle A\phi,\alpha\rangle_1 = [\phi]^t[M_0]\left([M_0]^{-1} [A]^t [M_1] \right) [\alpha] = [\phi]^t [M_0] [A^\dagger] [\alpha] = \langle \phi,A^\dagger \alpha\rangle_0

so that

[A^\dagger] = [M_0]^{-1} [A]^t [M_1].

Similarly, the adjoint of a linear map B:V_1\to V_2 may be represented in matrix form via

[B^\dagger] = [M_1]^{-1} [B]^t [M_2].

In computational electromagnetics, a fundamental linear map is the exterior derivative, which will be denoted d_i:V_i\to V_{i+1} for i = 0,1. Since V_0, V_1, and V_2 are finite dimensional, d_i has a sparse matrix representation [d_i].

For the sake of interpretation, the matrix [d_0] may be thought of as the gradient along the respective directed edge, [d_1] may be thought of as the curl of the edge vector field around each oriented face, [d_1^\dagger] may be thought of as the transverse gradient[1] across each directed edge, and [d_0^\dagger] may be thought of as the divergence of the edge vector field.

Critically note,

[d_1][d_0] = 0.

As a result, we have the inner product space of edge vector fields V_1 decomposes into

V_1 = \text{im} d_0\oplus \text{im} d_1^\dagger\oplus \text{ker} \Delta,

where \Delta = d_0\circ d_0^\dagger + d_1^\dagger\circ d_1. In other words, any edge vector v\in V_1 may be expressed as

v = d_0\phi + d_1^\dagger\beta + h

for some \phi\in V_0, \beta\in V_2, and h\in \text{ker} \Delta. The above may be thought of as a discrete version of Hodge-Helmholtz decomposition for computational electromagnetics.


This note is an informal (and quickly drafted) document intended to help explain Hodge-Helmholtz decomposition in computational electromagnetics. No claim of any original content is intended and a proper literature search was not performed. For pointers to some related material with more complete references, see the following:

[1] If the degree of freedom associated to an oriented face is interpreted as the magnitude of vector normal to the face,  may be thought of as the curl of this normal vector field along the directed edge.

Written by Eric

March 17, 2012 at 6:36 pm

Posted in Uncategorized

Network Theory and Discrete Calculus – Coordinates

with 3 comments

This post is part of a series

When the binary tree was presented in the context of discrete calculus, the following small section of the tree was illustrated to establish the way the nodes are labelled

Two sets of coordinates were introduced on the binary tree:

  1. Cartesian coordinates (x,t)
  2. Graph coordinates (u^+,u^-)

The following illustrates the binary tree when we zoom out a bit:

Cartesian coordinates were defined such that

  • x(i\pm 1,j+1) = x(i,j) \pm \Delta x, and
  • t(i\pm 1,j+1) = t(i,j) + \Delta t

resulting in

  • dx = \Delta x (\mathbf{e}^+ - \mathbf{e}^-), and
  • dt = \Delta t (\mathbf{e}^+ + \mathbf{e}^-).

Although Cartesian coordinates often help to relate discrete calculus to continuum calculus, the expressions are often not the most natural to work with when performing computations. One reason for this can be understood by overlaying the Cartesian coordinate lines onto the binary tree.

On the other hand, graph coordinates are defined such that

  • u^+(i+1,j+1) = u^+(i,j) + \Delta u^+,
  • u^+(i-1,j+1) = u^+(i,j),
  • u^-(i+1,j+1) = u^-(i,j),
  • u^-(i-1,j+1) = u^-(i,j) + \Delta u^-

resulting in

  • du^+ = \Delta u^+ \mathbf{e}^+, and
  • du^- = \Delta u^- \mathbf{e}^-.

Computations are often cleaner when using graph coordinates. One reason for this can be understood by overlaying the graph coordinate lines onto the binary tree.

For instance, the commutative relations in graph coordinates are given by

  • [du^\pm,u^\pm] = \Delta u^\pm du^\pm
  • [du^\pm,u^\mp] = 0

whereas the commutative relations for Cartesian coordinates are given by

  • [dx,x] = \frac{(\Delta x)^2}{\Delta t} dt
  • [dt,t] = \Delta t dt.
  • [dx,t] = [dt,x] = \Delta t dx.

The cross commutative relations between the two sets of coordinates are given by

  • [du^\pm,x] = [dx,u^\pm] = \pm\Delta x du^\pm
  • [du^\pm,t] = [dt,u^\pm] = \Delta t du^\pm

As a final note, for any discrete 0-form \phi, the above indicates that

d\phi = \frac{1}{\Delta t} [dt,\phi].

Written by Eric

February 19, 2012 at 2:17 pm