Significant Digits

Everyone who has taken a first-year chemistry class has learned that significant digits (aka “significant figures” or “sig figs”) indicate the precision of a measurement. The basic rule is that you keep all measurement digits you are certain about plus one more that you estimate. Unfortunately, computers don’t know anything about significant digits. Developers creating data systems for scientific measurements should always include a rounding step as part of any data output. Not embracing significant digits can have … uhm … “significant” consequences.
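
As a minimal sketch of such a rounding step, base R's signif() rounds a value to a given number of significant digits. The readings and the choice of three digits below are made up for illustration.

```r
# Hypothetical readings, each recorded to three significant digits
readings <- c(12.3, 12.4, 12.6)

# The raw mean carries many more digits than the measurements justify
raw_mean <- mean(readings)
print(raw_mean, digits = 15)

# Rounding step applied before output: keep three significant digits
reported_mean <- signif(raw_mean, digits = 3)
print(reported_mean)  # 12.4
```

The number of digits to keep depends on the instrument, so in a real system it would likely be a per-parameter setting rather than a hard-coded constant.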

Systematic error messages

Anyone writing code for use in data processing systems needs to have a well-thought-out protocol for generating error messages and logs. When a complex pipeline breaks, good logs and recognizable error messages are key to debugging the problem. This post describes improvements to the MazamaCoreUtils package that help you create systematic error messages that are easier for calling functions to handle.
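
The general idea can be sketched with base R alone. This is not the MazamaCoreUtils API; the function names and the "DATA_LOAD_ERROR" prefix are hypothetical, chosen only to show how a recognizable message lets a calling function decide what to do.

```r
# Hypothetical pipeline step that fails with a low-level message
load_data <- function(path) {
  stop(paste0("cannot open file '", path, "'"))
}

# Wrap the step so that every failure is re-thrown with a fixed,
# recognizable prefix (the prefix text is an assumption of this sketch)
load_data_safely <- function(path) {
  tryCatch(
    load_data(path),
    error = function(e) {
      stop(paste0("DATA_LOAD_ERROR: ", conditionMessage(e)), call. = FALSE)
    }
  )
}

# A calling function can branch on the prefix instead of parsing
# arbitrary low-level messages; unrecognized errors are passed along
result <- tryCatch(
  load_data_safely("missing.csv"),
  error = function(e) {
    if (startsWith(conditionMessage(e), "DATA_LOAD_ERROR:")) {
      "handled: data could not be loaded"
    } else {
      stop(e)
    }
  }
)
print(result)
```

Keeping the set of prefixes small and documented is what makes the messages systematic: callers match on a known vocabulary rather than on free text.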

Easy Rolling Means with MazamaRollUtils

Our goal in creating a new package of C++ rolling functions is to build up a suite of functions useful in environmental time series analysis. We want these functions to be available in a neutral environment with no underlying data model. The functions are meant to be as straightforward to use as reasonably possible, with a target audience of data analysts at any level of R expertise.
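
For readers new to the idea, here is a minimal sketch of a centered rolling mean on a plain numeric vector. It uses base R's stats::filter() as a stand-in rather than the MazamaRollUtils functions, and the data and window width are made up.

```r
# Hypothetical hourly measurements with one spike
x <- c(1, 2, 3, 10, 3, 2, 1)

# Centered rolling mean with a window of 3, using base R's stats::filter().
# Each output value is the average of a point and its two neighbors;
# the ends have no full window and come back as NA.
width <- 3
rolling_mean <- as.numeric(stats::filter(x, rep(1 / width, width), sides = 2))

print(rolling_mean)
```

The input is an ordinary vector with no attached data model, which is the kind of neutral interface described above.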

Beautiful Maps with MazamaSpatialPlots

Many of us have become addicted to The NY Times COVID maps: maps of US state- or county-level data colored by cases, vaccinations, per capita infections, etc. While recreating maps like these in R is possible, it is disappointingly difficult. The just-released MazamaSpatialPlots R package takes a first stab at remedying this situation.

Using R — Calling C code with Rcpp

In two previous posts we described how R can call C code with .C() and the more complex yet more robust option of calling C code with .Call(). Here we will describe how the Rcpp package can be used to greatly simplify your C code without forcing you to become an expert in C++.

Using R – Calling C code ‘Hello World!’

One of the reasons that R has so much functionality is that people have incorporated a lot of academic code written in C, C++, Fortran and Java into various packages. Libraries written in these languages are often both robust and fast. If you are using R to support people in a particular field, you may be called upon to incorporate some outside code into your R environment. Unfortunately, much of the documentation on how to do this is written at a very high level. In this post we will distil some of the available information on calling C code from R into three “Hello World” examples.

Ten UNIX commands every data manager should know

Working with data from varied sources can be frustrating — some data will be in CSV format; some in XML; some available as HTML pages; other data as relational databases or MS Excel spreadsheets.

This post will cover the UNIX tools that every data manager needs to be familiar with in order to work with varied data sources.
