Introduction

This vignette demonstrates how to use the pins package to access the PFU Database and its data products.

Preparation

Take the following steps to prepare for accessing PFU Database products.

(1) Obtain access to IEA EWEB data

The PFU Database contains Extended World Energy Balance (EWEB) data from the IEA. To use the PFU Database, you or your institution must already have purchased the EWEB data, obtained a license to use the EWEB data, or subscribed to a data service with access to the EWEB data. At present, you must have access to all countries and all years of EWEB data to access the PFU Database.

(2) Install Dropbox

The PFU Database is currently distributed via a shared Dropbox folder, so users of the database must install Dropbox. As of May 2023, the data in the Dropbox folder occupy 12 GB, and the database will grow in the future. You may need to upgrade your Dropbox plan if that amount of data would cause you to exceed your current storage limit.

(3) Receive and accept an invitation

A member of the PFU Database team will provide read-only access to the Dropbox folder in which the PFU Database products are stored. Accept the invitation and let Dropbox sync the PFU Database folder to your computer.

The folder containing PFU Database products is called PipelineReleases.
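Before creating a pin board, it can help to confirm that the folder is present on disk. The following is a minimal sketch in base R. The helper name pfu_folder_exists is hypothetical (it is not part of pins or of any PFU Database tooling), and the default path assumes Dropbox syncs to ~/Dropbox, which may differ on your computer. The check only confirms that the folder exists, not that Dropbox has finished syncing its contents.

```r
# Hypothetical helper (an assumption, not part of pins or PFU Database tooling):
# returns TRUE when the given folder exists on disk.
pfu_folder_exists <- function(path = "~/Dropbox/PipelineReleases") {
  dir.exists(path.expand(path))
}

# Demonstration with a temporary folder standing in for the Dropbox folder,
# so that this sketch runs on any machine.
demo_folder <- file.path(tempdir(), "PipelineReleases")
dir.create(demo_folder, showWarnings = FALSE)
pfu_folder_exists(demo_folder)
```

For real use, call pfu_folder_exists() with the path at which Dropbox placed PipelineReleases on your computer.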

(4) Become familiar with the pins package

The pins package is the mechanism by which versions of the PFU Database and its products are stored, maintained, and distributed. Please review the pins package documentation before accessing PFU Database data.

(5) Become familiar with the PFU Database and products

Strictly speaking, the PFU Database consists of only primary, final, and useful energy and exergy data for all countries and all years available in the IEA EWEB data. The data are arranged in Physical Supply Use Table (PSUT) format as described in the paper by Heun et al. entitled “A physical supply-use table framework for energy analysis on the energy conversion chain.” The data are stored in the Dropbox folder as .rds files in matsindf format.

In addition to the database itself, there are several data products available, most of which are aggregations, subsets, or other calculations made from the database. For example, primary, final, and useful energy and exergy aggregations are available for each country and each year (data product C). Primary-to-final, final-to-useful, and primary-to-useful efficiencies are also available for each country and each year (data product D).

The database, its versions, and its data products are documented in the file named versions and products.xlsx at the top level of the PipelineReleases folder. Look through that file to determine which PFU Database products are needed for your research. In particular, note the “Pin name” and the “Pin version” of the data products that you desire.

Note: there may be other versions of pins in the Dropbox folder. Ignore all versions of pins except those identified in the versions and products.xlsx file.
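To compare the pin names and pin versions in the spreadsheet with what is actually stored on a board, pins can list both. The sketch below uses a temporary board and a toy pin named psut_demo (both are stand-ins invented for illustration, not PFU Database content) so that it runs without the Dropbox folder. With the real folder, the board would instead be created with pins::board_folder() pointing at PipelineReleases.

```r
library(pins)

# A temporary board stands in for the PipelineReleases folder (assumption,
# for illustration only).
demo_board <- pins::board_temp(versioned = TRUE)

# Write a toy pin; the name "psut_demo" and its contents are hypothetical.
pins::pin_write(demo_board, data.frame(Country = c("USA", "COL")),
                name = "psut_demo", type = "rds")

# List the names of all pins on the board.
demo_pin_names <- pins::pin_list(demo_board)

# List the versions stored for one pin; the "version" column holds the
# version strings to compare against the "Pin version" in the spreadsheet.
demo_versions <- pins::pin_versions(demo_board, name = "psut_demo")
demo_versions$version
```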

Accessing the PFU Database and other data products

To load the PFU Database itself into your R session, supply the pin name and pin version to pins functions. The following code loads v1 of the database itself, using the name of the pin (“psut”) and the version string for version 1 of the database (“20221109T152414Z-7d7ad”).

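library(pins)
# Create a versioned board that points at the synced Dropbox folder.
pfu_pinboard <- pins::board_folder(path = "~/Dropbox/PipelineReleases", versioned = TRUE)
# Read a specific version of the "psut" pin.
psut_data_frame <- pins::pin_read(board = pfu_pinboard, name = "psut", 
                                  version = "20221109T152414Z-7d7ad")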

Because the database contains a large amount of data, it may take a minute or two to load.

Although we recommend against it, you may load the latest version of the same pin by not specifying the version string.

psut_data_frame_latest <- pins::pin_read(board = pfu_pinboard, name = "psut")
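If you do load the latest version, it may be worth recording which version string you received so that the analysis can be repeated later. The following is a sketch under the assumption that the board's metadata exposes the version at local$version, as pins documents for its boards. It uses a temporary board and a toy pin named demo_pin (both hypothetical) so that it runs without the Dropbox folder.

```r
library(pins)

# Temporary board and toy pin (hypothetical stand-ins for the real board).
demo_board <- pins::board_temp(versioned = TRUE)
pins::pin_write(demo_board, data.frame(x = 1:3), name = "demo_pin", type = "rds")

# Read the latest version by omitting the version argument.
demo_latest <- pins::pin_read(board = demo_board, name = "demo_pin")

# Look up the metadata for the latest version and keep its version string.
demo_meta <- pins::pin_meta(board = demo_board, name = "demo_pin")
loaded_version <- demo_meta$local$version
loaded_version
```

Passing the saved version string to pins::pin_read() later should retrieve the same data, provided that version is still on the board.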

The procedure is the same for loading any data product in the database. For example, data product A is a subset of the database that contains data for one country (“USA”) only. Product A may be useful for testing, because it occupies much less memory than the full database and loads much faster. To load product A for version 1 of the database, use the following code.

psut_data_frame_usa <- pins::pin_read(board = pfu_pinboard, name = "psut_usa", 
                                      version = "20230217T182459Z-287a0")

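To select the data for a single country from the full database, filter on the Country column. The following code, which requires the dplyr package, keeps only the rows for Colombia (“COL”).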

col_only <- psut_data_frame |> 
  dplyr::filter(Country == "COL")
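The same approach extends to several countries at once by using %in%. The sketch below runs on a small toy data frame rather than the real database. Apart from Country, its column names and values are invented for illustration and are not taken from the PFU Database.

```r
# Toy data frame standing in for the database (hypothetical content).
toy_psut <- data.frame(
  Country = c("COL", "COL", "USA", "GBR"),
  demo_value = c(1, 2, 3, 4)  # invented column, for illustration only
)

# Keep the rows whose Country is any of the listed codes.
col_usa_only <- toy_psut |> 
  dplyr::filter(Country %in% c("COL", "USA"))

unique(col_usa_only$Country)
```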

Questions

Please reach out to a member of the PFU Database team with questions.