I work with lots of environmental time series data from stationary instruments. This post describes why you should avoid mixing data and metadata in a single file and instead suggests an easy-to-implement, easy-to-use, maximally compact format consisting of two .csv files linked by unique identifiers.
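As a rough illustration of the two-file idea, here is a minimal Python sketch. The column names, the sensor IDs, the values, and the wide "one column per sensor" layout are all assumptions made up for this example, not necessarily the layout the full post recommends.

```python
import csv
import io

# Hypothetical metadata file: one row per instrument, keyed by a unique ID.
META_CSV = """sensor_id,longitude,latitude,elevation_m
A1,-120.18,48.59,650
B2,-120.40,48.47,540
"""

# Hypothetical data file: one row per timestamp, one column per sensor ID.
# An empty field means the value is missing.
DATA_CSV = """datetime,A1,B2
2021-08-01T00:00:00Z,12.3,8.1
2021-08-01T01:00:00Z,15.0,
"""


def load_meta(text):
    """Return {sensor_id: metadata dict} from the metadata CSV text."""
    return {row["sensor_id"]: row for row in csv.DictReader(io.StringIO(text))}


def load_series(text, sensor_id):
    """Return [(datetime, value or None)] for one sensor from the data CSV text."""
    series = []
    for row in csv.DictReader(io.StringIO(text)):
        raw = row[sensor_id]
        series.append((row["datetime"], float(raw) if raw else None))
    return series


meta = load_meta(META_CSV)
series_b2 = load_series(DATA_CSV, "B2")
```

The column headers in the data file are the same identifiers used as keys in the metadata file, so location and other instrument details never need to be repeated alongside the measurements.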
Continue reading “How NOT to format time series data”
Author: jonathanscallahan
(re-)Learning Javascript
It’s been several years since I worked full-time on a JavaScript project and a lot has improved in that time. This post will document my journey and will hopefully contain some best-practice suggestions.
Continue reading “(re-)Learning Javascript”
Significant Digits
Everyone who has taken a first year chemistry class has learned that significant digits (aka “significant figures” or “sig figs”) indicate the precision of a measurement. The basic rule is that you save all measurement digits you are certain about plus one more that you estimate. Unfortunately, computers don’t know anything about significant digits. Developers creating data systems for scientific measurements should always include a rounding step as part of any data output. Not embracing significant digits can have … uhm … “significant” consequences.
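A rounding step of this kind can be quite small. The sketch below is one possible Python version; the function name `round_sig` is hypothetical and is not taken from any package mentioned here.

```python
import math


def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    # Zero, NaN and infinities have no meaningful leading digit; return as-is.
    if value == 0 or not math.isfinite(value):
        return value
    # Position of the leading digit, e.g. 2 for 123.456 and -3 for 0.00123.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)


three_digits = round_sig(123.456, 3)     # 123.0
small_value = round_sig(0.00123456, 3)   # 0.00123
two_digits = round_sig(98765.0, 2)       # 99000.0
```

Applying a function like this just before values are written out keeps stored data from implying more precision than the instrument actually has.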
Continue reading “Significant Digits”
Systematic error messages
Anyone writing code for use in data processing systems needs to have a well thought-out protocol for generating error messages and logs. When a complex pipeline breaks, good logs and recognizable error messages are key to debugging the problem. This post describes improvements to the MazamaCoreUtils package that help you create systematic error messages that can be better handled by calling functions.
Continue reading “Systematic error messages”
Comparing Air Quality Sites
Air quality continues to be in the news with New York Times articles like these:
- Health Risks of Smoke and Ozone Rise in the West as Wildfires Worsen
- Even Low Levels of Soot Can Be Deadly to Older People, Research Finds
A quick review of web-based air quality resources shows a range of sites featuring maps, time series plots, and relevant information.
Continue reading “Comparing Air Quality Sites”
Methow Valley Air Quality
Mazama Science has released a new set of tutorials demonstrating the use of air quality R packages to investigate data from regulatory monitors and low-cost sensors. This post is just a short summary of what the tutorials cover. We invite anyone interested in wildfire smoke and air quality to run through the tutorials and provide feedback.
Continue reading “Methow Valley Air Quality”
Zero vs. Missing
On the left we have zero, our integer measure of nothingness. On the right we have missing value, aka N/A, aka NA, our signal that the value of a datapoint is unknown. Everyone who deals with data has to deal with this important distinction. And far too often people get it wrong.
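To see why the distinction matters, consider a small Python sketch in which `None` stands in for a missing value. The readings are made up for illustration.

```python
def mean_ignoring_missing(values):
    """Average only the values that were actually measured."""
    present = [v for v in values if v is not None]
    if not present:
        return None  # nothing was measured, so the mean is itself missing
    return sum(present) / len(present)


# Hypothetical readings: the second is missing, the third is a genuine zero.
readings = [4.0, None, 0.0, 8.0]

correct_mean = mean_ignoring_missing(readings)                 # 12.0 / 3
wrong_mean = sum(v or 0.0 for v in readings) / len(readings)   # 12.0 / 4
```

Treating the missing reading as zero drags the average down from 4.0 to 3.0, while the genuine zero is a real measurement and correctly counts in both calculations.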
Continue reading “Zero vs. Missing”
MazamaSpatialUtils R package
Version 0.7 of the MazamaSpatialUtils package is now available on CRAN and includes an expanded suite of spatial datasets with even greater cleanup and harmonization than in previous versions. If your work involves environmental monitoring of any kind, this package may be of use. Here is the description:
A suite of conversion functions to create internally standardized spatial polygons dataframes. Utility functions use these data sets to return values such as country, state, timezone, watershed, etc. associated with a set of longitude/latitude pairs. (They also make cool maps.)
In this post we discuss the reasons for creating this package and describe its main features.
Continue reading “MazamaSpatialUtils R package”
Data producers vs. data consumers
In the marketplace, the needs of producers and consumers are often at odds: producers want higher prices, consumers lower ones; producers want easy assembly, consumers easy dis-assembly; producers want flexibility and rapid prototyping, consumers reliability and long-term support.
The same competing needs exist in the world of scientific data management where producers of data and consumers of data often operate in very different worlds with very different sets of tools.
Continue reading “Data producers vs. data consumers”
When is a number not a number?
Have you ever asked yourself whether your telephone number is really a number? It’s got numbers in it but does it measure anything?
How about your credit card number? PO Box? Social Security Number? Zip code? What would happen if you subtracted one of these from another?
As it turns out, many of the “numbers” we deal with every day are actually identifiers and not a measure of something. Sadly, too many data managers do not distinguish between the two even though making this distinction is quite simple.
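One common way this goes wrong is a zip code with a leading zero. The Python sketch below uses a made-up two-row table (the column names and counts are hypothetical) to show what is lost when an identifier is parsed as a number.

```python
import csv
import io

# Hypothetical table: zip_code is an identifier, station_count is a measure.
CSV_TEXT = """zip_code,station_count
02134,3
98862,1
"""

rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Kept as text, the identifier survives intact.
zip_as_text = rows[0]["zip_code"]

# Parsed as an integer, the leading zero is silently dropped.
zip_as_int = int(rows[0]["zip_code"])

# Only the genuine measure should be converted to a number.
total_stations = sum(int(row["station_count"]) for row in rows)
```

Here `zip_as_text` is still `"02134"` while `zip_as_int` has become `2134`, which no longer matches the original identifier. Adding up the station counts, by contrast, is perfectly meaningful.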
Continue reading “When is a number not a number?”