class: title-slide, center, middle

# Methods of Empirical Finance

## Seminar (UE)

### Christoph Huber

### University of Innsbruck

#### Master in Banking and Finance

### Winter term 2019/20 (this version: 2019-11-19)

---
class: inverse, center, middle

# Course outline (Syllabus)

### Methods of Empirical Finance

---
<footer>
<p>UE Methods of Empirical Finance | Christoph Huber</p>
</footer>

# Objective

- this course provides an introduction to basic methodological concepts, methods, and models commonly applied in economics and finance, their weaknesses and strengths, as well as their fields of application

- by focusing on how to choose, apply, and interpret different methods and models, the course provides the essential knowledge to conduct empirical research on your own — especially with respect to data analysis and methodological issues for your master’s thesis

---
<footer></footer>

# Approach

- (__Lecture, VO__) the course presents just as much theory as necessary to understand what you are doing and to give you a sufficient basis to broaden, on your own, your knowledge of empirical finance, more sophisticated methods and models, and statistical data applications

- (__Seminar, UE__) in a hands-on approach, methodological and theoretical concepts are applied to answer research questions from the field of empirical finance using an appropriate statistical software package

---
<footer></footer>

# Grading

- grades in the seminar (UE) are based on two assignments to be completed in pairs; more detailed information on the problem sets will be provided in due time

- to pass the seminar with a positive grade, at least 50% of the points must be reached in each of the two assignments; assignments not handed in on time will result in a negative seminar grade

---
<footer></footer>

# Grading

- the following grading key will be applied:

|Points             |Grade|
|---                |---|
| `$< 50.0\%$`        | ‘_deficient_’ (5) |
| `$50.0\% – 62.5\%$` | ‘_sufficient_’ (4)|
| `$62.5\% – 75.0\%$` | ‘_satisfactory_’ (3)|
| `$75.0\% – 87.5\%$` | ‘_good_’ (2)|
| `$>87.5\%$`         | ‘_very good_’ (1)|

- the overall course grade equals the ECTS-weighted average of the grades in the lecture (VO) and the seminar (UE),
  i.e. 0.6 × VO-grade + 0.4 × UE-grade
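
As a quick illustration of the weighting, a minimal R sketch. The two example grades are made up, and how the result is rounded to a final grade is not specified here:

```r
# hypothetical example grades (1 = very good, 5 = deficient)
vo_grade <- 2
ue_grade <- 3

# ECTS-weighted average: 0.6 * VO grade + 0.4 * UE grade
overall <- 0.6 * vo_grade + 0.4 * ue_grade
overall
```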

---
<footer></footer>

# Dates

- the seminar will take place at the following times:

|Date/Time                    |Room|
|---                          |---|
|Tu 19.11.2019, 10:00-12:45   |AR 4 (ZID)|
|Tu 26.11.2019, 10:00-12:45   |AR 4 (ZID)|
|Tu 03.12.2019, 10:00-12:45   |AR 4 (ZID)|
|Tu 10.12.2019, 10:00-11:45   |AR 4 (ZID)|
|Tu 07.01.2020, 10:00-11:45   |AR 4 (ZID)|

- the assignments have to be handed in by the following times:

|Date/Time                    |Assignment|
|---                          |---|
|Fr 06.12.2019, 23:59         |Assignment 1|
|Fr 17.01.2020, 23:59         |Assignment 2|

---
class: center, middle
<footer></footer>

# Contact

__Christoph Huber, MSc__

Department of Banking and Finance<br>
University of Innsbruck<br>
Universitätsstraße 15, 6020 Innsbruck<br>
4<sup>th</sup> floor, room o.4.08

e-mail: christoph.huber@uibk.ac.at<br>
phone: +43-(0)512-507-73015

[chr-huber.com](https://chr-huber.com)

---
class: center, middle, inverse

# Introduction

### Methods of Empirical Finance

---
class: middle

# What are we going to cover in the seminar?

- selected topics, which...

- ... relate to the lecture (VO)

- ... I think are _interesting_ and/or _useful_ <br>
  (e.g. for future seminar papers, your Master's thesis, etc.)

---
<footer></footer>

# Software

- you will use statistical software packages to __analyze empirical data__
  - in the seminar
  - in the assignments
  
- you are free to use the software you are most familiar with,
  _as long as_ the software package allows for _writing scripts_!

- suitable statistical software packages are, among many others:<br>
  _R_, _Stata_, _EViews_, _MATLAB_, etc.
  
- however, _I_ will mainly use __R__

---
<footer></footer>

# Documentation

- when analyzing empirical data, it is important to __*document* each step__ of the analysis (from loading the data into the statistical software package to transforming the data and creating tables and figures) so that the results can be replicated
  
- possible ways to document each step of the analysis are:
  - Notebooks (e.g. _R Markdown_, _Jupyter_, etc.)
  - Annotated code (in the `.R` or `.do` script etc.)
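
A minimal sketch of what annotated code might look like in an `.R` script. The data are a small made-up example created in place, not course data:

```r
# ---- 1. Load the data ----
# (toy data; a real script would read a file here)
returns <- data.frame(id = 1:4, ret = c(0.02, -0.01, 0.03, 0.00))

# ---- 2. Transform the data ----
# express returns in percent
returns$ret_pct <- 100 * returns$ret

# ---- 3. Create tables and figures ----
summary(returns$ret_pct)
hist(returns$ret_pct, main = "Returns (in %)", xlab = "Return")
```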

---
class: center, middle, inverse

# Introduction to R

### Methods of Empirical Finance

---
<footer></footer>

# Introduction to R

- R project homepage: https://www.R-project.org/
- Open-source software project, GNU General Public License (GPL).
- Comprehensive R Archive Network (CRAN):
  https://CRAN.R-project.org
  
### Installation

- Go to CRAN, pick the version for your operating
system, and follow the instructions in the readme file.

.smaller[
- Microsoft Windows: Download and run setup `.exe` file.

- Mac OS X: Installer package (`.pkg`) for base system and
platform-specific GUI, along with additional programming
tools (as disk image `.dmg` files).

- Linux: Pre-packaged binaries for various flavors (`.deb` or
`.rpm` files), also interfaced in various update managers
(_apt_, _yum_, etc.).
]
---
<footer></footer>

# Introduction to R

### R as a Calculator

```r
1 + 1
```

```
## [1] 2
```

```r
2^3
```

```
## [1] 8
```

.smaller[
__Mathematical functions__: e.g. `log()`, `exp()`, `sin()`,
`asin()`, `cos()`, `acos()`, `tan()`, `atan()`, `sign()`, `sqrt()`, `abs()`,
`min()`, `max()`, ...
]

```r
log(exp(sin(pi/4)^2) * exp(cos(pi/4)^2))
```

```
## [1] 1
```

---
<footer></footer>

# Introduction to R

### Vector arithmetic

.pull-left[
.smaller[
__Generation of vectors:__ e.g., via `c()`:
]

```r
x <- c(1.8, 3.14, 4, 88.169, 13)
length(x)
```

```
## [1] 5
```

.smaller[
__Assignment operators:__ `<-` or `=`

__Subsets of vectors:__
]

```r
x[c(1, 4)]
```

```
## [1]  1.800 88.169
```

]
.pull-right[
.smaller[
__Examples:__
]

```r
2 * x + 3
```

```
## [1]   6.600   9.280  11.000 179.338  29.000
```

```r
5:1 * x + 1:5
```

```
## [1]  10.000  14.560  15.000 180.338  18.000
```

```r
log(x)
```

```
## [1] 0.5877867 1.1442228 1.3862944 4.4792554 2.5649494
```

]
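
Besides positions, subsets can also be selected with logical conditions or negative indices. A short sketch using the same `x` as on this slide:

```r
x <- c(1.8, 3.14, 4, 88.169, 13)

x[x > 3.5]   # elements larger than 3.5
x[-1]        # all elements except the first
x[2:3]       # second and third element
```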

---
<footer></footer>

# Introduction to R

### Data management

.pull-left[

```r
mydata <- data.frame(one = 1:10, 
                     two = 11:20, 
                     three = 21:30)

mydata
```

```
##    one two three
## 1    1  11    21
## 2    2  12    22
## 3    3  13    23
## 4    4  14    24
## 5    5  15    25
## 6    6  16    26
## 7    7  17    27
## 8    8  18    28
## 9    9  19    29
## 10  10  20    30
```

]

.pull-right[
.smaller[
__Data frames:__ Standard structure for rectangular data sets in R.
]

__Select columns:__

```r
mydata$two
```

```
##  [1] 11 12 13 14 15 16 17 18 19 20
```

```r
mydata[, "two"]
```

```
##  [1] 11 12 13 14 15 16 17 18 19 20
```

```r
mydata[, 2]
```

```
##  [1] 11 12 13 14 15 16 17 18 19 20
```

]
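
Rows can be selected with the same bracket notation, and new columns are added by assignment. A minimal sketch reusing `mydata` from this slide; the column name `four` is made up for illustration:

```r
mydata <- data.frame(one = 1:10, two = 11:20, three = 21:30)

mydata[mydata$one > 7, ]                  # rows where 'one' exceeds 7
mydata[1:3, c("one", "three")]            # first three rows, two columns
mydata$four <- mydata$one + mydata$two    # add a new column
head(mydata, 3)                           # show the first three rows
```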

---
<footer></footer>

# Introduction to R

### Data management

.pull-left[
.smaller[Import]

```r
# Plain text
newdata <- read.table("mydata.txt",
                      header = TRUE)
```

```r
# Excel spreadsheet (.xls, .xlsx)
library(readxl)
newdata <- read_excel("mydata.xls")
```

```r
# Stata files (up to Stata 12 format;
# for newer files: haven::read_dta())
library(foreign)
newdata <- read.dta("mydata.dta")
```
]

.pull-right[
.smaller[Export]

```r
# Plain text
write.table(mydata, file = "mydata.txt",
            col.names = TRUE)
```

```r
# Stata files
library(foreign)
write.dta(mydata, file = "mydata.dta")
```

.smaller[__R format__]

```r
save(mydata, file = "mydata.rda")
load("mydata.rda")
```

]
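
Comma-separated files (`.csv`) can be handled in the same way with `write.csv()` and `read.csv()` from base R. A small round-trip sketch; the file is written to a temporary location so that nothing in the working directory is overwritten:

```r
mydata <- data.frame(one = 1:10, two = 11:20, three = 21:30)

# write to a temporary .csv file (without row names) ...
csv_file <- tempfile(fileext = ".csv")
write.csv(mydata, file = csv_file, row.names = FALSE)

# ... and read it back in
newdata <- read.csv(csv_file)
str(newdata)
```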

---
<footer></footer>

# Introduction to R

### Data management

.smaller[
__Factors:__

Categorical information (e.g. gender, ethnicity, species) is stored in _factors_.

]

```r
g <- rep(0:1, c(2, 4))
g <- factor(g, levels = 0:1, labels = c("male", "female"))
g
```

```
## [1] male   male   female female female female
## Levels: male female
```
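
Factors are convenient for counting and grouping. A short sketch based on `g` from above; the numeric vector `y` is made up for illustration:

```r
g <- factor(rep(0:1, c(2, 4)), levels = 0:1, labels = c("male", "female"))

levels(g)   # the categories
table(g)    # number of observations per category

# group-wise means of a (made-up) numeric variable
y <- c(10, 12, 9, 11, 13, 15)
tapply(y, g, mean)
```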

---
<footer></footer>

# Introduction to R

### Data management

.smaller[
__Missing values:__

Missing values are coded as `NA` (for "not available").
Many functions accept the argument `na.rm = TRUE` to ignore missing values.
E.g.:
.left-column-reverse[

```r
x <- c(4, 7, 3, 2, NA, 16, NA, 8)

is.na(x)        # shows for each data point whether it is NA
```

```
## [1] FALSE FALSE FALSE FALSE  TRUE FALSE  TRUE FALSE
```

```r
sum(is.na(x))   # counts the missing values
```

```
## [1] 2
```
]

.right-column-reverse[
`x` contains two missing values
]

.pull-left[

```r
mean(x)
```

```
## [1] NA
```
]
.pull-right[

```r
mean(x, na.rm=TRUE)
```

```
## [1] 6.666667
```
]
]
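
Missing values can also be dropped explicitly before an analysis. A minimal sketch with the same `x`:

```r
x <- c(4, 7, 3, 2, NA, 16, NA, 8)

x_complete <- x[!is.na(x)]   # keep only the non-missing values
length(x_complete)
mean(x_complete)             # same result as mean(x, na.rm = TRUE)
summary(x)                   # summary() also reports the number of NAs
```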

---
<footer></footer>

# Introduction to R

### Packages

.smaller[
Installing and loading packages:

- If connected to the internet, simply type
`install.packages("partykit")` to install _partykit_.
- Additionally for Windows, Mac, RStudio: GUI installer
menus.
- Packages are installed in _libraries_ (= collections of
packages).
- Library paths can be specified (see `?library`).
- Packages are loaded by the command `library()`, e.g.,
`library("partykit")`.
- `library()` lists all currently installed packages.

__CRAN task views:__ Overview of packages for certain tasks
(e.g., environmetrics, psychometrics, time series, ...).<br>
https://CRAN.R-project.org/web/views/
]
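
In scripts, it can be handy to install a package only if it is not yet available. One possible pattern (a sketch, not the only way to do it) uses `requireNamespace()`; the helper name `use_package()` is made up for illustration:

```r
# hypothetical helper: load a package, installing it first if needed
use_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)   # needs an internet connection
  }
  library(pkg, character.only = TRUE)
}

use_package("stats")        # ships with R, so nothing is installed
# use_package("partykit")   # would install partykit if it is missing
```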

---
class: middle, inverse

# Your turn

- Load the data in file `sp500_data.csv` into your statistical software package

- This file contains data on companies in the _S&P 500 Index_ from _Bloomberg_

<br>

- Get an overview of the available data

- Prepare _descriptive statistics_ for the data at hand

- Try to prepare summary statistics by industry/sector

- Try to prepare figures describing one or more variables in the data set you
  find interesting

<br>

- Make sure to use a `script` and _document_ how you solved the task
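
One possible starting point in R, sketched with a small made-up data set because the actual variables in `sp500_data.csv` are not listed here; `sector` and `price` are hypothetical column names:

```r
# with the real file: sp500 <- read.csv("sp500_data.csv")
# toy stand-in with hypothetical columns:
sp500 <- data.frame(
  sector = c("Energy", "Energy", "Financials", "Financials", "Utilities"),
  price  = c(50, 70, 30, 40, 60)
)

str(sp500)        # overview of the variables
summary(sp500)    # descriptive statistics

# summary statistics by sector
aggregate(price ~ sector, data = sp500, FUN = mean)

# a simple figure
boxplot(price ~ sector, data = sp500, ylab = "Price")
```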