--- title: Introductory analysis of daily precipitation with hydroTSM author: - Mauricio Zambrano-Bigiarini^[mauricio.zambrano@ufrontera.cl] date: " version 0.9, 17-Jan-2024" output: pdf_document: number_sections: yes html_document: df_print: paged number_sections: yes vignette: | %\VignetteIndexEntry{Introductory analysis of daily precipitation with hydroTSM} %\VignetteKeyword{hydrology} %\VignetteKeyword{hydrological modelling} %\VignetteEncoding{UTF-8} %\VignetteEngine{knitr::rmarkdown} --- ```{r setup, include = FALSE} knitr::opts_chunk$set(echo = TRUE) ``` # Installation Installing the latest stable version (from [CRAN](http://cran.r-project.org/web/packages/hydroTSM/)): ```{r installation1, eval=FALSE} install.packages("hydroTSM") ``` \noindent Alternatively, you can also try the under-development version (from [Github](https://github.com/hzambran/hydroTSM)): ```{r installation2, eval=FALSE} if (!require(devtools)) install.packages("devtools") library(devtools) install_github("hzambran/hydroTSM") ``` # Setting up the environment Loading the *hydroTSM* package, which contains data and functions used in this analysis: ```{r LoadingPkg} library(hydroTSM) ``` Loading daily precipitation data at the station San Martino di Castrozza, Trento Province, Italy, from 01/Jan/1921 to 31/Dec/1990. ```{r LoadingData} data(SanMartinoPPts) ``` Selecting only a 6-years time slice for the analysis ```{r Window1} x <- window(SanMartinoPPts, start="1985-01-01") ``` Dates of the daily values of 'x' ```{r Dates} dates <- time(x) ``` Amount of years in 'x' (needed for computations) ```{r yip} ( nyears <- yip(from=start(x), to=end(x), out.type="nmbr" ) ) ``` # Basic exploratory data analysis (EDA) 1) Summary statistics ```{r smry} smry(x) ``` 2) Amount of days with information (not NA) per year ```{r dwi1} dwi(x) ``` 3) Amount of days with information (not NA) per month per year ```{r dwi2} dwi(x, out.unit="mpy") ``` 4) Computation of monthly values only when the percentage of NAs in each month is lower than a user-defined percentage (10% in this example). ```{r Custom_daily2monthly} # Loading the DAILY precipitation data at SanMartino data(SanMartinoPPts) y <- SanMartinoPPts # Subsetting 'y' to its first three months (Jan/1921 - Mar/1921) y <- window(y, end="1921-03-31") ## Transforming into NA the 10% of values in 'y' set.seed(10) # for reproducible results n <- length(y) n.nas <- round(0.1*n, 0) na.index <- sample(1:n, n.nas) y[na.index] <- NA ## Daily to monthly, only for months with less than 10% of missing values (m2 <- daily2monthly(y, FUN=sum, na.rm=TRUE, na.rm.max=0.1)) # Verifying that the second and third month of 'x' had 10% or more of missing values cmv(y, tscale="month") ``` 4) Basic exploratory figures: Using the *hydroplot* function, which (by default) plots 9 different graphs: 3 ts plots, 3 boxplots and 3 histograms summarizing 'x'. For this example, only daily and monthly plots are produced, and only data starting on 01-Jan-1987 are plotted. ```{r hydroplot, fig.width=10, fig.height=8} hydroplot(x, var.type="Precipitation", main="at San Martino", pfreq = "dm", from="1987-01-01") ``` Global view of daily precipitation values a calendar heatmap (six years maximum), useful for visually identifying dry, normal and wet days: ```{r calendarHeatmap, fig.width=8.6, fig.height=8.7} calendarHeatmap(x) ``` For each month, the previous figure is read from top to bottom. For example, January 1st 1987 was Thursday, January 31th 1987 was Saturday and November 1st 1990 was Thursday. Selecting only a three-month time slice for the analysis: ```{r Window3} yy <- window(SanMartinoPPts, start="1990-10-01") ``` Plotting the selected time series: ```{r hydroplot3} hydroplot(yy, ptype="ts", pfreq="o", var.unit="mm") ``` # Annual analysis Annual values of precipitation ```{r daily2annual} daily2annual(x, FUN=sum, na.rm=TRUE) ``` Average annual precipitation Obvious way: ```{r daily2annual2} mean( daily2annual(x, FUN=sum, na.rm=TRUE) ) ``` Another way (more useful for streamflows, where `FUN=mean`): The function *annualfunction* applies `FUN` twice over `x`: ( i) firstly, over all the elements of `x` belonging to the same year, in order to obtain the corresponding annual values, and (ii) secondly, over all the annual values of `x` previously obtained, in order to obtain a single annual value. ```{r annualfunction} annualfunction(x, FUN=sum, na.rm=TRUE) / nyears ``` # Monthly analysis 1) Plotting the monthly precipitation values for each year, useful for identifying dry/wet months. ```{r matrixplot} # Daily zoo to monthly zoo m <- daily2monthly(x, FUN=sum, na.rm=TRUE) # Creating a matrix with monthly values per year in each column M <- matrix(m, ncol=12, byrow=TRUE) colnames(M) <- month.abb rownames(M) <- unique(format(time(m), "%Y")) # Plotting the monthly precipitation values require(lattice) print(matrixplot(M, ColorRamp="Precipitation", main="Monthly precipitation at San Martino st., [mm/month]")) ``` 2) Median of the monthly values at station 'x'. Not needed, just for looking at these values in the boxplot. ```{r monthlyfunction} monthlyfunction(m, FUN=median, na.rm=TRUE) ``` 3) Vector with the three-letter abbreviations for the month names ```{r cmonth} cmonth <- format(time(m), "%b") ``` 4) Creating ordered monthly factors ```{r months} months <- factor(cmonth, levels=unique(cmonth), ordered=TRUE) ``` 5) Boxplot of the monthly values ```{r boxplotMonthly} boxplot( coredata(m) ~ months, col="lightblue", main="Monthly Precipitation", ylab="Precipitation, [mm]", xlab="Month") ``` # Seasonal analysis Average seasonal values of precipitation ```{r seasonalfunction} seasonalfunction(x, FUN=sum, na.rm=TRUE) / nyears ``` Extracting the seasonal values for each year ```{r dm2seasonal} ( DJF <- dm2seasonal(x, season="DJF", FUN=sum) ) ( MAM <- dm2seasonal(m, season="MAM", FUN=sum) ) ( JJA <- dm2seasonal(m, season="JJA", FUN=sum) ) ( SON <- dm2seasonal(m, season="SON", FUN=sum) ) ``` Plotting the time evolution of the seasonal precipitation values ```{r hydroplot2, fig.width=12, fig.height=10} hydroplot(x, pfreq="seasonal", FUN=sum, stype="default") ``` # Some extreme indices Common steps for the analysis of this section: Loading daily precipitation data at the station San Martino di Castrozza, Trento Province, Italy, with data from 01/Jan/1921 to 31/Dec/1990. ```{r LoadingData2} data(SanMartinoPPts) ``` Selecting only a 6-year time slice for the analysis ```{r Window4} x <- window(SanMartinoPPts, start="1985-10-01") ``` Plotting the selected time series ```{r hydroplot4} hydroplot(x, ptype="ts", pfreq="o", var.unit="mm") ``` ## Seasonality index Computing the seasonality index defined by Walsh and Lawler (1981) to classify the precipitation regime of `x`: ```{r SeasonalityIndex} si(x) ``` According to the seasonality index defined by Walsh and Lawler (1981), a value of 0.35 corresponds to a precipitation regime that can be classified as "Equable but with a definite wetter season" (see more details with `?si`). ## Heavy precipitation days (R10mm) Counting and plotting the number of days in the period where precipitation is > 10 [mm]: ```{r R10mm} ( R10mm <- length( x[x>10] ) ) ``` ## Very wet days (R95p) Identifying the wet days (daily precipitation >= 1 mm): ```{r wet_index} wet.index <- which(x >= 1) ``` Computing the 95th percentile of precipitation on wet days (*PRwn95*): ```{r PRwn95} ( PRwn95 <- quantile(x[wet.index], probs=0.95, na.rm=TRUE) ) ``` **Note 1**: this computation was carried out for the three-year time period 1988-1990, not the 30-year period 1961-1990 commonly used. **Note 2**: missing values are removed from the computation. Identifying the very wet days (daily precipitation >= *PRwn95*): ```{r very_wet_index} (very.wet.index <- which(x >= PRwn95)) ``` Computing the total precipitation on the very wet days: ```{r R95p} ( R95p <- sum(x[very.wet.index]) ) ``` **Note 3**: this computation was carried out for the three-year time period 1988-1990, not the 30-year period 1961-1990 commonly used ## 5-day total precipitation Computing the 5-day total (accumulated) precipitation: ```{r x_5max} x.5max <- rollapply(data=x, width=5, FUN=sum, fill=NA, partial= TRUE, align="center") hydroplot(x.5max, ptype="ts+boxplot", pfreq="o", var.unit="mm") ``` Maximum annual value of 5-day total precipitation: ```{r (x_5max_annual} (x.5max.annual <- daily2annual(x.5max, FUN=max, na.rm=TRUE)) ``` **Note 1**: for this computation, a moving window centred in the current day is used. If the user wants the 5-day total precipitation accumulated in the 4 days before the current day + the precipitation in the current day, the user have to modify the moving window.\newline **Note 2**: For the first two and last two values, the width of the window is adapted to ignore values not within the time series # Climograph Since v0.5-0, `hydroTSM` includes a function to plot a climograph, considering not only precipitation but air temperature data as well. ```{r climograph1, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9)} # Loading daily ts of precipitation, maximum and minimum temperature data(MaquehueTemuco) # extracting individual ts of precipitation, maximum and minimum temperature pcp <- MaquehueTemuco[, 1] tmx <- MaquehueTemuco[, 2] tmn <- MaquehueTemuco[, 3] ``` Plotting a full climograph: ```{r climograph2, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9)} m <- climograph(pcp=pcp, tmx=tmx, tmn=tmn, na.rm=TRUE, main="Maquehue Temuco Ad (Chile)", lat=-38.770, lon=-72.637) ``` \newpage Plotting a climograph with uncertainty bands around mean values, but with no labels for tmx and tmn: ```{r climograph3, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9)} m <- climograph(pcp=pcp, tmx=tmx, tmn=tmn, na.rm=TRUE, tmx.labels=FALSE, tmn.labels=FALSE, main="Maquehue Temuco Ad (Chile)", lat=-38.770, lon=-72.637) ``` \newpage Plotting a climograph with uncertainty bands around mean values, but with no labels for tmx, tmn and pcp: ```{r climograph4, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9)} m <- climograph(pcp=pcp, tmx=tmx, tmn=tmn, na.rm=TRUE, pcp.labels=FALSE, tmean.labels=FALSE, tmx.labels=FALSE, tmn.labels=FALSE, main="Maquehue Temuco Ad (Chile)", lat=-38.770, lon=-72.637) ``` \newpage To better represent the hydrological year in Chile (South America), the following figure will plot a full climograph starting in April (`start.month=4`) instead of January (`start.month=1`): ```{r climograph5, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9)} m <- climograph(pcp=pcp, tmx=tmx, tmn=tmn, na.rm=TRUE, start.month=4, temp.labels.dx=c(rep(-0.2,4), rep(0.2,6),rep(-0.2,2)), main="Maquehue Temuco Ad (Chile)", lat=-38.770, lon=-72.637) ``` # Software details This tutorial was built under: ```{r echo=FALSE} sessionInfo()$platform sessionInfo()$R.version$version.string paste("hydroTSM", sessionInfo()$otherPkgs$hydroTSM$Version) ``` # Version history * v0.9: Jan 2024 * v0.8: Nov 2023 * v0.7: Mar 2020 * v0.6: Aug 2017 * v0.5: May 2013 * v0.4: Aug 2011 * v0.3: Apr 2011 * v0.2: Oct 2010 * v0.1: 30-May-2013 # Appendix In order to make easier the use of \texttt{hydroTSM} for users not familiar with R, in this section a minimal set of information is provided to guide the user in the [R](https://www.r-project.org/) world. ## Editors, GUI * **Multi-platform**: [Sublime Text](https://sublime.weberup.com/) (https://sublime.weberup.com/) ; [RStudio](https://posit.co/) (https://posit.co/) * **GNU/Linux only**: [ESS](https://ess.r-project.org/) (https://ess.r-project.org/) * **Windows only** : [NppToR](https://sourceforge.net/projects/npptor/) (https://sourceforge.net/projects/npptor/) ## Importing data * `?read.table`, `?write.table`: allow the user to read/write a file (in $~$table format) and create a data frame from it. Related functions are `?read.csv`, `?write.csv`, `?read.csv2`, `?write.csv2`. * `?zoo::read.zoo`, `?zoo::write.zoo`: functions for reading and writing time series from/to text files, respectively. * [**R Data Import/Export**](https://cran.r-project.org/doc/manuals/r-release/R-data.html): https://cran.r-project.org/doc/manuals/r-release/R-data.html * [**foreign** R package](https://cran.r-project.org/package=foreign): read data stored in several R-external formats (dBase, Minitab, S, SAS, SPSS, Stata, Systat, Weka, ...) * [**readxl** R package](https://cran.r-project.org/package=readxl): Import MS Excel files into R. * [**some examples**](https://www.statmethods.net/data-input/importingdata.html): https://www.statmethods.net/data-input/importingdata.html ## Useful Websites * [**Quick R**](https://www.statmethods.net/): https://www.statmethods.net/ * [**Time series in R**](https://cran.r-project.org/view=TimeSeries): https://cran.r-project.org/view=TimeSeries * [**Quick reference for the `zoo` package**](https://cran.r-project.org/package=zoo/vignettes/zoo-quickref.pdf): https://cran.r-project.org/package=zoo/vignettes/zoo-quickref.pdf ## F.A.Q. # How to print more than one `matrixplot` in a single Figure? Because `matrixplot` is based on lattice graphs, normal plotting commands included in base R does not work. Therefore, for plotting ore than 1 matrixplot in a single figure, you need to save the individual plots in an R object and then print them as you want. In the following sequential lines of code, you can see two examples that show you how to plot two matrixplots in a single Figure: ```{r FAQmatrixplot1, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9)} library(hydroTSM) data(SanMartinoPPts) x <- window(SanMartinoPPts, end=as.Date("1960-12-31")) m <- daily2monthly(x, FUN=sum, na.rm=TRUE) M <- matrix(m, ncol=12, byrow=TRUE) colnames(M) <- month.abb rownames(M) <- unique(format(time(m), "%Y")) p <- matrixplot(M, ColorRamp="Precipitation", main="Monthly precipitation,") print(p, position=c(0, .6, 1, 1), more=TRUE) print(p, position=c(0, 0, 1, .4)) ``` The second and easier way allows you to obtain the same previous figure (not shown here), but you are required to install the `gridExtra` package: ```{r FAQmatrixplot2, fig.width = 8, fig.height = 7, , dpi=100, fig.align = "center", dev.args=list(pointsize = 9), eval=TRUE} if (!require(gridExtra)) install.packages("gridExtra") require(gridExtra) # also loads grid require(lattice) grid.arrange(p, p, nrow=2) ```