Recently I needed to read uff files into R. This file format was developed at the University of Cincinnati to standardize the way vibration measurements are stored. As it happens, there isn’t a library capable of importing uff files into R, but there is one written in Python and another for Matlab.
I ran into a bug while trying to run the open modal library. It might have been easy to solve, but this Matlab library worked straight away when I tried it. That left me with one task: how to pass data from Matlab/Octave into R? Here’s a list of the best options I’ve found for my use case.
Passing structures around using json
One of my first attempts was to use json, since the uff files were imported into a structure full of metadata associated with each measurement. It seemed natural to export the whole structure to a json file, which I did using the function below:
function json_exporter(uff_struct,fname)
jsonStr = jsonencode(uff_struct);
fid = fopen(fname, 'w');
if fid == -1, error('Cannot create JSON file'); end
fwrite(fid, jsonStr, 'char');
fclose(fid);
end
Finally, getting the data into R is as simple as calling the rjson package:
library('rjson')
json_data <- fromJSON(file = '/path/to/file.json')
The bonus of this approach is that complicated data structures with mixed types can be exported easily. The downside is duplication of data: what existed only in uff format now also has a json equivalent. The json equivalent also takes more disk space than the original uff files, which led me to the next, more space-friendly approach.
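One thing to keep in mind is that nested arrays do not come back from rjson as matrices. The snippet below is a minimal sketch of rebuilding one, assuming the exported structure has a 2-D field called amplitude; the field names and the inlined json string are hypothetical stand-ins for a real exported file:

```r
library(rjson)

# Hypothetical stand-in for the exported file: jsonencode writes a
# 2x3 matrix as an array of rows.
json_str <- '{"name": "ch1", "amplitude": [[1, 2, 3], [4, 5, 6]]}'
json_data <- fromJSON(json_str)

# rjson returns the nested array as a list of numeric vectors (one per row),
# so bind the rows together to recover the matrix.
amplitude <- do.call(rbind, json_data$amplitude)
dim(amplitude)
```

Metadata fields such as json_data$name stay available as ordinary list elements next to the numeric data.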
Passing structured data with hdf5 files
Directly from Wikipedia:
Hierarchical Data Format (HDF) is a set of file formats (HDF4, HDF5) designed to store and organize large amounts of data. Originally developed at the National Center for Supercomputing Applications, it is supported by The HDF Group, a non-profit corporation whose mission is to ensure continued development of HDF5 technologies and the continued accessibility of data stored in HDF.
Strangely enough, I had trouble loading hdf5 files written by Octave into R, while hdf5 files generated by Matlab could be read without any issue. Here’s how to save two matrices named amplitude and time into a sample.h5 file:
%% Write h5file
h5create('sample.h5','/amplitude',size(amplitude))
h5write('sample.h5','/amplitude',amplitude)
h5create('sample.h5','/time',size(time))
h5write('sample.h5','/time',time)
h5disp('sample.h5')
To read the hdf5 file I used the h5 library, since I had not noticed it was already deprecated; hdf5r should be used instead:
library('h5')
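file <- h5file("~/sample.h5", mode = 'r')
# Open each dataset by the name used in h5create
amplitude <- file['/amplitude']
tempo <- file['/time']
# Read the full datasets and transpose them into data frames
adf <- as.data.frame(t(amplitude[]))
tdf <- as.data.frame(t(tempo[]))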
Compared to json, the hdf5 files took up less space. This was my preferred method, since exporting matrices to text files would be too cumbersome, and a single hdf5 file can store multiple variables.
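For reference, here is a rough sketch of the same read done with hdf5r rather than the deprecated h5 package. To keep it self-contained it first writes a small temporary file from R as a stand-in for the one produced by Matlab; the values and the temporary path are made up for illustration:

```r
library(hdf5r)

# Stand-in for the Matlab-generated file: write two datasets to a temp file.
h5_path <- tempfile(fileext = ".h5")
h5_out <- H5File$new(h5_path, mode = "w")
h5_out[["amplitude"]] <- matrix(as.numeric(1:6), nrow = 2)
h5_out[["time"]] <- c(0, 0.1, 0.2)
h5_out$close_all()

# Open read-only, pull each dataset fully into memory, then close the file.
h5_in <- H5File$new(h5_path, mode = "r")
amplitude <- h5_in[["amplitude"]][, ]
time_s <- h5_in[["time"]][]
h5_in$close_all()

dim(amplitude)
```

With a file written by Matlab it is worth checking dim() on the result before deciding whether a transpose is still needed, since that depends on how the reader maps the stored dimensions.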
Passing data real time: Redis
Since the work involves a large amount of data, I did not want to export the uff files into yet another file just to manipulate them from R. Hence, I tried Redis, an in-memory data store that can also act as a message broker. To install it, just issue:
sudo apt install redis-server
To check if the server is up and running:
redis-cli ping
And check that you get PONG as an answer.
Now, to export, for instance, a matrix from Matlab/Octave to R, I used the excellent go-redis package. It is worth reading their data structure page: a matrix is stored as a Redis list in which the first item is the size of the matrix and the remaining items hold the values. For instance, take the creation of a matrix e in Octave and how to write it into Redis:
# Octave code
addpath(genpath(pwd))
R=redisConnection();
e=rand(7,3)
redisSet(R,'e',e);
w=redisGet(R,'e');
w
And finally retrieving it in R:
library(rredis)
redisConnect()
list_length<-as.numeric(redisLLen('e')[1])
msize<-strsplit(redisLRange('e',0,0)[[1]][1],' ')
rownumber<-as.numeric(msize[[1]][1])
colnumber<-as.numeric(msize[[1]][2])
mdata<-redisLRange('e',1,list_length)
e<-matrix(as.numeric(unlist(mdata)),nrow = rownumber, ncol = colnumber)
This method is not as pretty as the others, but it is quite impressive to be able to exchange data seamlessly between R and Matlab/Octave.
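The parsing steps can be wrapped in a small helper so they are easier to reuse and check. This is only a sketch: the function name is made up, the input is a hand-written stand-in for what redisLRange returns, and it assumes the values are stored in column-major order, which is how matrix() fills them:

```r
# Rebuild a matrix from a list whose first item is "rows cols" and whose
# remaining items are the values (assumed to be in column-major order).
redis_list_to_matrix <- function(items) {
  dims <- as.integer(strsplit(items[[1]], " ", fixed = TRUE)[[1]])
  values <- as.numeric(unlist(items[-1]))
  # Fail early if the stored size does not match the number of values.
  stopifnot(length(dims) == 2, length(values) == prod(dims))
  matrix(values, nrow = dims[1], ncol = dims[2])
}

# Hand-written stand-in for the Redis list of a 2x3 matrix.
items <- list("2 3", "1", "2", "3", "4", "5", "6")
m <- redis_list_to_matrix(items)
dim(m)
```

The size check catches the case where the list was only partially read or the key holds something other than a matrix.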