The Vanguard Total Stock Market ETF (NYSEARCA: VTI) tracks the performance of the CRSP U.S. Total Market Index, a market capitalization-weighted index that measures the entire investable U.S. equities market and includes small-, mid- and large-cap companies. The fund has returned 6.23% since its inception in 2001. It is passively managed and uses an index-sampling strategy.

```
library(quantmod)
require(RCurl)
require(foreign)
x <- getURL("https://raw.githubusercontent.com/RInterested/datasets/gh-pages/VTI%20historical%20data.csv")
VTI <- read.csv(text = x, sep =",")
VTI$Date <- as.Date(VTI$Date, format = "%Y-%m-%d")
rownames(VTI) <- VTI$Date
VTI$Date <- NULL # Removing the Date column
VTI <- as.xts(VTI)
chartSeries(VTI, type="auto", theme=chartTheme('white'), yrange = c(0,180))
```

Notice the split of the shares reflected as the broad band on the left of the plot:

Vanguard Total Stock Market ETF (VTI) has \(1\) split in our VTI split history database. The split for VTI took place on June \(18\), 2008. This was a \(2\) for \(1\) split, meaning for each share of VTI owned pre-split, the shareholder now owned \(2\) shares.

```
z = as.data.frame(VTI$Adj.Close) # Subsetting closing price
# Function to calculate % change in closing price between days:
D2D = function (x) {
days = nrow(x)
delta = numeric(days)
for(i in 2:days){
delta[i] <- (100*((x[i,1] - x[i - 1,1])/(x[i - 1,1])))
}
delta
}
VTI$InterDay = D2D(z) # Included as add'l column to VTI.
head(VTI$InterDay) # A zero in entry 1 spells trouble...
```

```
## InterDay
## 2001-06-15 0.0000000
## 2001-06-18 -0.6377436
## 2001-06-19 0.2802368
## 2001-06-20 1.0547174
## 2001-06-21 0.8028564
## 2001-06-22 -0.6371699
```
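The loop in `D2D` can also be written without explicit iteration. The following is a minimal sketch on made-up prices (the vector `prices` and its values are hypothetical, not VTI data). It relies on base R's `diff()` and marks the first day as `NA` rather than `0`, since there is no previous day to compare against:

```r
# Hypothetical closing prices, for illustration only (not VTI data)
prices <- c(100, 102, 99.96, 104.958)

# Percentage change between consecutive entries:
# diff() gives price[i] - price[i-1]; head(prices, -1) gives price[i-1]
pct_change <- c(NA, 100 * diff(prices) / head(prices, -1))
pct_change
```

With an `NA` in the first position, later calls such as `sum()` or `mean()` would need `na.rm = TRUE`.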

```
#... we need something to fill in the 0 in row 1. Why not the 2nd value?
VTI$InterDay[1]<-VTI$InterDay[2]
summary(VTI$InterDay)
```

```
## Index InterDay
## Min. :2001-06-15 00:00:00 Min. :-9.35295
## 1st Qu.:2005-06-14 18:00:00 1st Qu.:-0.47824
## Median :2009-06-09 12:00:00 Median : 0.07394
## Mean :2009-06-08 08:19:59 Mean : 0.02713
## 3rd Qu.:2013-06-04 06:00:00 3rd Qu.: 0.56837
## Max. :2017-05-26 00:00:00 Max. :12.82977
```

So the maximum loss between days has been \(-9.4\%\), and the maximum gain \(12.8\%\). When did these events occur?

`time(VTI)[VTI$InterDay == min(VTI$InterDay)]`

`## [1] "2008-10-15 EDT"`

`time(VTI)[which.max(VTI$InterDay)]`

`## [1] "2008-10-13 EDT"`

… both during the financial crisis of 2008.

How often does a drop of more than, say, \(4\%\) occur?

`sum(VTI$InterDay < -4)`

`## [1] 28`

`mean(VTI$InterDay < -4) * 100 # or 0.7% of the days`

`## [1] 0.6979063`

`sum(VTI$InterDay < -4) / 16 # or between 1 and 2 days every year.`

`## [1] 1.75`

… and what about gains of \(4\%\)?

`sum(VTI$InterDay > 4)`

`## [1] 23`

`mean(VTI$InterDay > 4) * 100 # or 0.6% of the days`

`## [1] 0.5732802`

`sum(VTI$InterDay > 4) / 16 # or between 1 and 2 days every year?`

`## [1] 1.4375`
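The divisor `16` used above is a rounded number of years. As a rough sketch, the span can instead be computed from the first and last dates of the series (copied here from the `summary()` output; the variable names are illustrative, and dividing by 365.25 is an approximation):

```r
# First and last dates of the series, as reported by summary(VTI$InterDay)
first_day <- as.Date("2001-06-15")
last_day  <- as.Date("2017-05-26")

# Approximate length of the sample in calendar years
years_covered <- as.numeric(last_day - first_day) / 365.25

# Counts of big moves taken from the output above
big_drops <- 28
big_gains <- 23

big_drops / years_covered   # big losing days per year
big_gains / years_covered   # big winning days per year
```

The span comes out just under 16 years, so the per-year rates barely change.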

But they can’t be distributed uniformly… Let’s first look at the losing days (to the left), and compare to winning days:

```
par(mfrow=c(1,2))
plot(time(VTI), VTI$InterDay < -4, col=2, type='h', xaxt="n",
xlab="Year", ylab="Days with > - 4% change",
cex.axis=.5, cex.main=.8, cex.lab =.5,las=2,
main = "Clustering of big drop days")
# Selecting the time information in the names of the rows of VTI:
tt = time(VTI)
# We select points spaced by approximately 1 business year:
# 365 days - 2 days off each weekend - 9 Holidays in the USA
ix = seq(1, nrow(VTI), by = 365 - 2*4*12 - 9)
# Formatting the labels as just simply the year with 4 digits:
fmt = "%Y"
# Generating vector of potential labels:
labs = format(tt, fmt)
# Plotting the x axis:
axis(side = 1, at = tt[ix], labels = labs[ix], cex.axis = 0.7, las = 2)
plot(time(VTI), VTI$InterDay > 4, col=3, type='h', xaxt="n",
xlab="Year", ylab="Days with > 4% gains",
cex.axis=.5, cex.main=.8, cex.lab =.5,las=2,
main = "Clustering of big winning days")
# Selecting the time information in the names of the rows of VTI:
tt = time(VTI)
# We select points spaced by approximately 1 business year:
# 365 days - 2 days off each weekend - 9 Holidays in the USA
ix = seq(1, nrow(VTI), by = 365 - 2*4*12 - 9)
# Formatting the labels as just simply the year with 4 digits:
fmt = "%Y"
# Generating vector of potential labels:
labs = format(tt, fmt)
# Plotting the x axis:
axis(side = 1, at = tt[ix], labels = labs[ix], cex.axis = 0.7, las = 2)
```
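The tick positions above rely on a fixed step of roughly one business year. An alternative sketch, shown here on a toy weekday calendar rather than on `VTI` itself, places a tick on the first available date of each calendar year (the toy dates ignore holidays, and `year_of` is an illustrative name):

```r
# Toy weekday-only calendar, a stand-in for time(VTI)
tt <- seq(as.Date("2001-06-15"), as.Date("2003-12-31"), by = "day")
tt <- tt[!format(tt, "%u") %in% c("6", "7")]   # drop Saturdays and Sundays

# Year of every date, and the index of the first date within each year
year_of <- format(tt, "%Y")
ix <- which(!duplicated(year_of))

tt[ix]        # candidate tick positions
year_of[ix]   # matching labels
```

These vectors could then be passed to `axis(side = 1, at = tt[ix], labels = year_of[ix])` in place of the fixed-step version.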

Wow! Huge spike around 2008 - 2009. How much clustering is there?

`sum(VTI["2008/2009"]$InterDay < - 4) # Total number of drops > -4%`

`## [1] 22`

```
# As a fraction of all big losing days (drops beyond -4%)...
sum(VTI["2008/2009"]$InterDay < - 4) / sum(VTI$InterDay < -4)
```

`## [1] 0.7857143`

What about wins during the same period?

`sum(VTI["2008/2009"]$InterDay > 4) # Total number of winning days > 4%`

`## [1] 15`

```
# As a fraction of all big winning days (gains beyond 4%)...
sum(VTI["2008/2009"]$InterDay > 4) / sum(VTI$InterDay > 4)
```

`## [1] 0.6521739`

So, during the crisis there were \(22\) “big losing” days and \(15\) “big winning” days.
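Clustering can also be summarised by counting big moves per calendar year. The snippet below is a minimal sketch on invented data (the `dates` and `moves` vectors are hypothetical and only mimic the shape of the `InterDay` column):

```r
# Hypothetical dates and day-to-day percentage changes (not VTI data)
dates <- as.Date(c("2007-03-01", "2008-10-07", "2008-11-19",
                   "2008-12-01", "2011-08-08"))
moves <- c(-1.2, -5.1, -6.3, -8.0, -4.5)

# Keep the days with a drop beyond -4% and tabulate them by year
big_drop <- moves < -4
drops_per_year <- table(format(dates[big_drop], "%Y"))
drops_per_year
```

On the real series, the same idea would use `time(VTI)` for the dates and `as.numeric(VTI$InterDay)` for the moves.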

```
plot(VTI$InterDay["2008-09-01/2009-03-01"], las=2, cex.axis=.6, main="Interday changes during crisis")
abline(h = 0)
```

Yet, the market tanked. Were the losing days worse, not just more numerous?

The maximum drop did occur during the crisis…

`min(VTI["2008/2009"]$InterDay)`

`## [1] -9.352953`

when?

`time(VTI["2008/2009"])[which.min(VTI["2008/2009"]$InterDay)]`

`## [1] "2008-10-15 EDT"`

Here is the news flashback:

Another huge Dow loss

Blue-chip indicator drubbed \(733\) points - \(2\)nd biggest point loss ever - as recession fears resurface.

By Alexandra Twin, CNNMoney.com senior writer

Last Updated: October 15, 2008: 6:21 PM ET

NEW YORK (CNNMoney.com) – Recession talk scared Wall Street Wednesday, sending the Dow Jones industrial average to its second biggest one-day point loss ever.

A weak retail sales report and dour forecasts from the Federal Reserve, coupled with sober comments from Fed Chairman Ben Bernanke, sent stocks tumbling.

A whopping \(-9\%\)! Incredibly, the maximum inter-day win also happened around this period…

`max(VTI["2008/2009"]$InterDay)`

`## [1] 12.82977`

Wow! When did it happen?

`time(VTI["2008/2009"])[which.max(VTI["2008/2009"]$InterDay)]`

`## [1] "2008-10-13 EDT"`

`VTI["2008-10-13"]`

```
## Open High Low Close Adj.Close Volume InterDay
## 2008-10-13 47.05 50.23 46.66 50.04 50.04 7777500 12.82977
```

and the news reflected it:

Dow jumps \(936\) points and S&P up \(104\), in the biggest point gains ever. The Dow, S&P and Nasdaq all gain over \(11\%\).

By Alexandra Twin, CNNMoney.com senior writer

Last Updated: October 13, 2008: 6:15 PM ET

NEW YORK (CNNMoney.com)

Stocks rallied Monday afternoon, with the Dow rallying \(976\) points during the session, as investors bet that the worst of the credit crisis is over, following a series of global initiatives announced over the last few days.

That bounce was predicated by the day’s news, with investors breathing a sigh of relief that some specifics were finally released regarding the \(\$700\) billion bank bailout.

Let’s home in on the crisis period:

```
crisis = VTI["2008-09-01/2013-03-01"]
plot(time(crisis), crisis$InterDay < - 4, col=2, type='h',
xlab="Year", ylab="",
cex.axis=.5, cex.main=.8, cex.lab =.5,las=2,
main = "Big losing and winning days during the crisis")
points(time(crisis), crisis$InterDay > 4, col=3, type='h')
```

Between the start of the crisis around September 2008, and the nadir, on March 09, 2009,

A year later, we know that March 9 was the bottom of a months-long financial panic that wiped away trillions of dollars in assets. But on what now appears to have been the best buying opportunity of a generation, many only wondered how much lower the markets would tumble. (Forbes)

For Dow, another \(12\)-year low.

S&P also finishes at lowest level in more than a decade as Wall Street resumes its retreat on economic worries.

By Alexandra Twin, CNNMoney.com senior writer Last Updated: March 9, 2009: 6:18 PM ET

there was a loss of

`(drop = as.numeric(VTI$Adj.Close["2009-03-09"]) - as.numeric(VTI$Adj.Close["2008-09-10"]))`

`## [1] -28.22`

`drop/as.numeric(VTI$Adj.Close["2008-09-10"]) * 100 # In percentage...`

`## [1] -45.57493`

which didn’t recover fully until March 2013, four years later:
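Part of the reason a recovery takes time is simple arithmetic: after a loss of \(p\) percent, the price has to rise by more than \(p\) percent to return to its starting level. A small sketch using the percentage drop computed above (the variable names are illustrative):

```r
# Percentage loss between 2008-09-10 and 2009-03-09, from the output above
loss_pct <- 45.57493

# Fraction of the starting price that is left after the loss
remaining <- 1 - loss_pct / 100

# Gain (in percent) needed to get back to the starting price
gain_needed <- 100 * (1 / remaining - 1)
gain_needed
```

So a drop of about \(46\%\) calls for a gain of roughly \(84\%\) from the bottom just to break even.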

What happened on October 10, 2009? Not much: it was a Saturday. And on Monday,

`VTI["2009-10-12"]`

```
## Open High Low Close Adj.Close Volume InterDay
## 2009-10-12 54.63 54.84 54.38 54.6 54.6 868000 0.3676397
```

not much happened, either… October 14, 2009, perhaps?

`VTI["2009-10-14"]`

```
## Open High Low Close Adj.Close Volume InterDay
## 2009-10-14 55.11 55.48 54.87 55.42 55.42 1322200 1.781445
```

Finally, let’s take a look at the fat tails of VTI as representative of the market, and as compared to the normal distribution:

```
hist(VTI$InterDay, main="INTERDAY VTI % CHANGES: FAT TAILS",
sub="fitted normal (in grey); pdf estimate (in dark red)",
xlab = "PERCENTAGE CHANGE FROM DAY TO DAY",
ylab = "DENSITY", prob = TRUE, col ="coral", ylim=c(0,0.5),
xlim=c(-4,4), cex.main=.9, cex.sub=.8, cex.lab=.7, border=FALSE,
breaks = 33)
# Fitted normal with the sample mean and standard deviation
# (named m and s so that the base function sd() is not shadowed):
s = sd(VTI$InterDay)
m = mean(VTI$InterDay)
curve(dnorm(x,mean=m,sd=s),col="grey60",lwd=2,add=TRUE,yaxt="n")
# Smoothed kernel density estimate, drawn once on top of the normal curve:
lines(density(VTI$InterDay,adjust=7),col="darkred",lwd=2)
```

But let’s quantify the lack of normality:

```
qqnorm(VTI$InterDay, pch=19, cex=.1, col=rgb(0,0,1,0.5), cex.main=.8, cex.lab=.7, cex.axis=.7)
qqline(VTI$InterDay, col=rgb(0,0,1,0.8))
```

`shapiro.test(as.vector(VTI$InterDay))`

```
##
## Shapiro-Wilk normality test
##
## data: as.vector(VTI$InterDay)
## W = 0.90896, p-value < 2.2e-16
```
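A complementary way to put a number on fat tails is the sample excess kurtosis, which is close to zero for normal data and clearly positive for heavy-tailed data. The following is only a sketch on simulated data: the Student t sample is a stand-in for fat-tailed returns, not the VTI series, and `excess_kurtosis` is a helper defined here rather than a base R function.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 4000                        # arbitrary sample size

normal_sample <- rnorm(n)        # thin-tailed reference
heavy_sample  <- rt(n, df = 3)   # Student t with 3 df: heavy tails

# Sample excess kurtosis: fourth central moment over squared variance, minus 3
excess_kurtosis <- function(x) {
  centred <- x - mean(x)
  mean(centred^4) / mean(centred^2)^2 - 3
}

excess_kurtosis(normal_sample)   # near 0
excess_kurtosis(heavy_sample)    # well above 0
```

Applied to the real data, the call would be `excess_kurtosis(as.numeric(VTI$InterDay))`.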