Channel: Active questions tagged header - Stack Overflow

Loading data from .h5 files and then operating on it seems much slower than operating directly on arrays [closed]

Say I have a data.h5 file containing a 1D float array. I then have a header file called open_data.h:

    #ifndef PROJECT
    #define PROJECT

    #include <iostream>
    #include <typeinfo>
    #include "hdf5.h"

    template <typename T1, typename T2>
    void open_data(T1& data, const char* path,
                   const char* dataset_path, T2 indicator) {
        // indicator is just an arbitrary number of int or double
        // for H5Dread to tell if to use H5T_NATIVE_FLOAT or
        // H5T_NATIVE_INT.

        // Specify the HDF5 file name
        const char* filename = path;

        // Open the HDF5 file
        hid_t file_id = H5Fopen(filename, H5F_ACC_RDONLY, H5P_DEFAULT);

        // Open the dataset
        hid_t dataset_id = H5Dopen2(file_id, dataset_path, H5P_DEFAULT);

        // Read data from the dataset
        if (typeid(indicator) == typeid(int)) {
            H5Dread(dataset_id, H5T_NATIVE_INT, H5S_ALL,
                    H5S_ALL, H5P_DEFAULT, data);
        }
        else {
            H5Dread(dataset_id, H5T_NATIVE_FLOAT, H5S_ALL,
                    H5S_ALL, H5P_DEFAULT, data);
        }

        // Close the dataset
        H5Dclose(dataset_id);

        // Close the HDF5 file
        H5Fclose(file_id);
    }

    #endif

I have a function to do some simple operations, in the function.h header file:

    float multiply(int i, int j, float* arr) {
        return arr[i] * arr[j];
    }

Now, in main.cpp, I first have to load the data from the .h5 file into an array and then call the function multiply:

    #include "open_data.h"
    #include "function.h"
    #include <iostream>

    float arr[10000];

    int main() {
        const char* path = "/users/tony/Desktop/data.h5";
        open_data(arr, path, "/arr", 2.0);
        float res = 0.0;
        for (int i=0; i<10000; ++i) {
            res += multiply(i, 0, arr);
        }
        return 0;
    }

But if I do not load data into arr and instead just use the randomly initialised values, i.e.

    #include "open_data.h"
    #include "function.h"
    #include <iostream>

    float arr[10000];

    int main() {
        float res = 0.0;
        for (int i=0; i<10000; ++i) {
            res += multiply(i, 0, arr);
        }
        return 0;
    }

The code is faster, and I don't know why. The speed difference is much more significant with higher-dimensional arrays, although I have not shown that here. The number of floating-point operations is the same; the only difference is that I load data into the array. The load is done only once, and I have measured its overhead and subtracted it to make the comparison fair. How can I optimise this?
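
For reference, here is a minimal sketch of how the compute loop alone could be timed, so that the one-off load is excluded by construction rather than subtracted afterwards. The helper names `sum_products` and `time_sum_products` are hypothetical and not part of my real code, `multiply` is repeated (with a `const` pointer) so the snippet stands alone, and `std::chrono::steady_clock` is just one possible timer.

```cpp
#include <cassert>
#include <chrono>

// Same operation as multiply() in function.h, repeated here so the
// sketch is self-contained.
inline float multiply(int i, int j, const float* arr) {
    return arr[i] * arr[j];
}

// Hypothetical helper: the loop from main.cpp, summing arr[i] * arr[0].
inline float sum_products(const float* arr, int n) {
    assert(arr != nullptr && n > 0);
    float res = 0.0f;
    for (int i = 0; i < n; ++i) {
        res += multiply(i, 0, arr);
    }
    return res;
}

// Hypothetical helper: times only the compute loop and returns the
// elapsed time in seconds. The sum is written to `out` so the caller
// can use it (for example print it), which keeps the loop observable.
inline double time_sum_products(const float* arr, int n, float& out) {
    const auto start = std::chrono::steady_clock::now();
    out = sum_products(arr, n);
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

The idea would be to call `time_sum_products(arr, 10000, res)` once after `open_data(...)` and once without it, and to print `res` in both cases. Two caveats I am aware of: in the programs above `res` is never used, so an optimising compiler is allowed to drop the loop entirely, and a namespace-scope `float arr[10000]` is zero-initialised in C++, so the second program may not be operating on random values at all.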

