Thursday, July 25, 2013

Including binary files in an R package

The R package format supports data in standard formats (.R, .RData, .csv) in the data/ directory. Unfortunately, data in unsupported formats (e.g. audio files, images, SQLite databases) is ignored by the package build command.

The solution, as hinted at in the manual's description of the data/ directory, is to place such data in the inst/extdata/ directory:
"It should not be used for other data files needed by the package, and the convention has grown up to use directory inst/extdata for such files."

Using a SQLite database file as an example, an R package can provide a default database by using the path to the built-in database as a default argument to its functions. Because the path depends on where the package is installed and is only known at runtime, the best solution is to include an exported function that returns the path to the built-in database:

pkg.default.database <- function() {
    system.file('extdata', 'default_db.sqlite', package='pkg')
}

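In this example, the package name is pkg, and the SQLite database file is inst/extdata/default_db.sqlite. The contents of inst/ are copied to the top level of the installed package, which is why the path passed to system.file omits inst/.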
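One detail worth knowing is what system.file does when the file is not found. The sketch below uses the stats package, which ships with R, purely as a stand-in for pkg; the file name no_such_file.sqlite is hypothetical. By default a missing file yields an empty string rather than an error, so a caller may want to check the result or pass mustWork=TRUE.

```r
# 'stats' ships with every R installation and stands in for 'pkg' here.
found <- system.file('DESCRIPTION', package = 'stats')

# A file that does not exist (hypothetical name): the result is "".
absent <- system.file('extdata', 'no_such_file.sqlite', package = 'stats')

print(nzchar(found))   # TRUE when the file was located
print(nzchar(absent))  # FALSE when system.file returned ""

# With mustWork = TRUE, a missing file raises an error instead.
result <- tryCatch(
    system.file('extdata', 'no_such_file.sqlite', package = 'stats',
                mustWork = TRUE),
    error = function(e) 'error raised'
)
print(result)
```

An empty path passed on to a database driver can fail in confusing ways, so failing early inside the path function may be the friendlier choice.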

Package functions that take a path to the SQLite database can then call this function in a default argument. For example:

pkg.fetch.rows <- function(db=pkg.default.database(), where=NULL, limit=NULL) {
    # connect to the database (requires the DBI and RSQLite packages)
    conn <- dbConnect(RSQLite::SQLite(), db)
    if (! dbExistsTable(conn, 'sensor_data')) {
        warning(paste('Table SENSOR_DATA does not exist in', db))
        dbDisconnect(conn)
        return(NULL)
    }

    # build query for table SENSOR_DATA
    query <- 'SELECT * FROM sensor_data'
    if (! is.null(where) ) {
        query <- paste(query, 'WHERE', where)
    }

    # send query and retrieve rows as a dataframe;
    # n=-1 fetches all rows when no limit is given
    ds <- dbSendQuery(conn, query)
    df <- dbFetch(ds, n=if (is.null(limit)) -1 else limit)

    # cleanup
    dbClearResult(ds)
    dbDisconnect(conn)

    return(df)
}
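
A possible variation on the fetch function, sketched here under the assumption that the DBI and RSQLite packages are installed. The function name fetch_rows_sketch, the temporary database file, and the sample rows are all hypothetical and exist only to make the example self-contained. It registers the disconnect with on.exit, so the connection is closed even if the query fails, and it uses dbGetQuery, which sends the query, fetches the rows, and clears the result in one call.

```r
library(DBI)

# Hypothetical stand-in for the packaged database: a temporary SQLite file.
db_path <- tempfile(fileext = '.sqlite')
setup <- dbConnect(RSQLite::SQLite(), db_path)
dbWriteTable(setup, 'sensor_data',
             data.frame(id = 1:3, reading = c(0.5, 1.5, 2.5)))
dbDisconnect(setup)

# Hypothetical variant of the fetch function.
fetch_rows_sketch <- function(db, where = NULL) {
    conn <- dbConnect(RSQLite::SQLite(), db)
    # Runs when the function exits, whether normally or through an error.
    on.exit(dbDisconnect(conn), add = TRUE)

    if (!dbExistsTable(conn, 'sensor_data')) {
        warning(paste('Table SENSOR_DATA does not exist in', db))
        return(NULL)
    }

    query <- 'SELECT * FROM sensor_data'
    if (!is.null(where)) {
        query <- paste(query, 'WHERE', where)
    }

    # Send the query, fetch all rows, and clear the result in one step.
    dbGetQuery(conn, query)
}

all_rows <- fetch_rows_sketch(db_path)
high_rows <- fetch_rows_sketch(db_path, where = 'reading > 1')
print(nrow(all_rows))
print(nrow(high_rows))
```

Note that the where string is pasted into the SQL text verbatim, so it should only come from trusted code, not from user input.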