Using Ifstream to Read Files and Columns
As a data scientist, reading and writing data from/to CSV is one of the most common tasks I do on the daily. R, my language of choice, makes this easy with read.csv() and write.csv() (although I tend to use fread() and fwrite() from the data.table package).
Hot take: C++ is not R.
As far as I know, there is no CSV reader/writer built into the C++ STL. That's not a knock against C++; it's just a lower level language. If we want to read and write CSV files with C++, we'll have to deal with file I/O, data types, and some low level logic on how to read, parse, and write data. For me, this is a necessary step in order to build and test more fun programs like machine learning models.
Writing to CSV
We'll start by creating a simple CSV file with one column of integer data. And we'll give it the header Foo.
    #include <fstream>

    int main() {
        // Create an output filestream object
        std::ofstream myFile("foo.csv");

        // Send data to the stream
        myFile << "Foo\n";
        myFile << "1\n";
        myFile << "2\n";
        myFile << "3\n";

        // Close the file
        myFile.close();

        return 0;
    }
Here, ofstream is an "output file stream". Since it's derived from ostream, we can treat it just like cout (which is itself an ostream). The result of executing this program is that we get a file called foo.csv in the program's current working directory (the same directory as our executable, if we run it from there). Let's wrap this into a write_csv() function that's a little more dynamic.
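To make the ostream relationship concrete, here is a minimal sketch of a function that accepts any std::ostream, so the same code can write to a file, to std::cout, or to an in-memory std::ostringstream. The name write_column is an assumption made for illustration; it is not part of the original code.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): writes a header line
// followed by one integer per line to any std::ostream. Because
// std::ofstream and std::ostringstream both derive from std::ostream,
// and std::cout is a std::ostream, all of them can be passed in.
void write_column(std::ostream& out, const std::string& colname,
                  const std::vector<int>& vals) {
    out << colname << "\n";
    for (int v : vals) {
        out << v << "\n";
    }
}
```

Calling write_column(std::cout, "Foo", {1, 2, 3}) would print the column to the terminal, while passing a std::ofstream would send the same text to a file. Writing to a std::ostringstream is a handy way to test the formatting without touching the disk.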
    #include <string>
    #include <fstream>
    #include <vector>

    void write_csv(std::string filename, std::string colname, std::vector<int> vals) {
        // Make a CSV file with one column of integer values
        // filename - the name of the file
        // colname - the name of the one and only column
        // vals - an integer vector of values

        // Create an output filestream object
        std::ofstream myFile(filename);

        // Send the column name to the stream
        myFile << colname << "\n";

        // Send data to the stream
        for (std::size_t i = 0; i < vals.size(); ++i) {
            myFile << vals.at(i) << "\n";
        }

        // Close the file
        myFile.close();
    }

    int main() {
        // Make a vector of length 100 filled with 1s
        std::vector<int> vec(100, 1);

        // Write the vector to CSV
        write_csv("ones.csv", "Col1", vec);

        return 0;
    }
Cool. Now we can use write_csv() to write a vector of integers to a CSV file with ease. Let's expand on this to support multiple vectors of integers and corresponding column names.
    #include <string>
    #include <fstream>
    #include <vector>
    #include <utility> // std::pair

    void write_csv(std::string filename, std::vector<std::pair<std::string, std::vector<int>>> dataset) {
        // Make a CSV file with one or more columns of integer values
        // Each column of data is represented by the pair <column name, column data>
        //   as std::pair<std::string, std::vector<int>>
        // The dataset is represented as a vector of these columns
        // Note that all columns should be the same size

        // Create an output filestream object
        std::ofstream myFile(filename);

        // Send column names to the stream
        for (std::size_t j = 0; j < dataset.size(); ++j) {
            myFile << dataset.at(j).first;
            if (j != dataset.size() - 1) myFile << ","; // No comma at end of line
        }
        myFile << "\n";

        // Send data to the stream
        for (std::size_t i = 0; i < dataset.at(0).second.size(); ++i) {
            for (std::size_t j = 0; j < dataset.size(); ++j) {
                myFile << dataset.at(j).second.at(i);
                if (j != dataset.size() - 1) myFile << ","; // No comma at end of line
            }
            myFile << "\n";
        }

        // Close the file
        myFile.close();
    }

    int main() {
        // Make three vectors, each of length 100, filled with 1s, 2s, and 3s
        std::vector<int> vec1(100, 1);
        std::vector<int> vec2(100, 2);
        std::vector<int> vec3(100, 3);

        // Wrap into a vector
        std::vector<std::pair<std::string, std::vector<int>>> vals = {{"One", vec1}, {"Two", vec2}, {"Three", vec3}};

        // Write the vector to CSV
        write_csv("three_cols.csv", vals);

        return 0;
    }
Here we've represented each column of data as a std::pair of <column name, column values>, and the whole dataset as a std::vector of such columns. Now we can write a variable number of integer columns to a CSV file.
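The multi-column write_csv() takes its row count from the first column, so a shorter later column makes at() throw std::out_of_range partway through the file, and a longer one has its extra values silently dropped. One way to guard against that is to check the sizes before writing. The sketch below is only an illustration; the Dataset alias and the columns_same_size name are assumptions, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Alias for the <column name, column values> layout (the alias name is an
// assumption made for this sketch).
using Dataset = std::vector<std::pair<std::string, std::vector<int>>>;

// Hypothetical helper: returns true when every column holds the same
// number of values. An empty dataset is treated as consistent.
bool columns_same_size(const Dataset& dataset) {
    if (dataset.empty()) return true;

    // Use the first column's length as the reference row count
    const std::size_t rows = dataset.front().second.size();

    for (const auto& column : dataset) {
        if (column.second.size() != rows) return false;
    }
    return true;
}
```

A caller could run this check first and throw or report an error instead of producing a partially written file.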
Reading from CSV
Now that we've written some CSV files, let's try to read them. For now, let's assume that our file contains only integer data plus one row of column names at the top.
    #include <string>
    #include <fstream>
    #include <vector>
    #include <utility>   // std::pair
    #include <stdexcept> // std::runtime_error
    #include <sstream>   // std::stringstream

    std::vector<std::pair<std::string, std::vector<int>>> read_csv(std::string filename) {
        // Reads a CSV file into a vector of <string, vector<int>> pairs where
        // each pair represents <column name, column values>

        // Create a vector of <string, int vector> pairs to store the result
        std::vector<std::pair<std::string, std::vector<int>>> result;

        // Create an input filestream
        std::ifstream myFile(filename);

        // Make sure the file is open
        if (!myFile.is_open()) throw std::runtime_error("Could not open file");

        // Helper vars
        std::string line, colname;
        int val;

        // Read the column names
        if (myFile.good()) {
            // Extract the first line in the file
            std::getline(myFile, line);

            // Create a stringstream from line
            std::stringstream ss(line);

            // Extract each column name
            while (std::getline(ss, colname, ',')) {
                // Initialize and add <colname, int vector> pairs to result
                result.push_back({colname, std::vector<int>{}});
            }
        }

        // Read data, line by line
        while (std::getline(myFile, line)) {
            // Create a stringstream of the current line
            std::stringstream ss(line);

            // Keep track of the current column index
            int colIdx = 0;

            // Extract each integer
            while (ss >> val) {
                // Add the current integer to the 'colIdx' column's values vector
                result.at(colIdx).second.push_back(val);

                // If the next token is a comma, ignore it and move on
                if (ss.peek() == ',') ss.ignore();

                // Increment the column index
                colIdx++;
            }
        }

        // Close file
        myFile.close();

        return result;
    }

    int main() {
        // Read three_cols.csv and ones.csv
        std::vector<std::pair<std::string, std::vector<int>>> three_cols = read_csv("three_cols.csv");
        std::vector<std::pair<std::string, std::vector<int>>> ones = read_csv("ones.csv");

        // Write to another file to check that this was successful
        // (write_csv() here is the multi-column version from the previous section)
        write_csv("three_cols_copy.csv", three_cols);
        write_csv("ones_copy.csv", ones);

        return 0;
    }
This program reads our previously created CSV files and writes each dataset to a new file, essentially creating copies of our original files.
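The core of read_csv() is the inner loop that pulls integers out of a single line: extract a value with >>, then peek at the next character and skip it if it is a comma. Here is that step isolated on an in-memory string, as a minimal sketch. The function name parse_int_line is an assumption made for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses one comma-separated line of integers using
// the same extract / peek / ignore pattern as read_csv().
std::vector<int> parse_int_line(const std::string& line) {
    std::vector<int> values;
    std::stringstream ss(line);
    int val;

    // operator>> fails (and ends the loop) at the first token that is
    // not an integer, or when the line is exhausted
    while (ss >> val) {
        values.push_back(val);

        // If the next character is a comma, skip it and move on
        if (ss.peek() == ',') ss.ignore();
    }
    return values;
}
```

Note that parsing stops quietly at the first non-integer token, so a line such as "1,x,3" yields only the first value. That is fine for files we wrote ourselves, but worth keeping in mind before pointing this at arbitrary CSV data.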
Going further
So far we've seen how to read and write datasets with integer values only. Extending this to read/write a dataset of only doubles or only strings should be fairly straightforward. Reading a dataset with unknown, mixed data types is another beast and beyond the scope of this article, but see this code review for possible solutions.
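As a rough sketch of the doubles-only and strings-only cases, one option is to split each line on commas with std::getline() and then convert each cell as needed. The function names below are assumptions made for illustration, and the sketch deliberately ignores quoted fields and embedded commas.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits a line on commas and returns the raw cells.
// This alone covers a strings-only dataset. Note that a trailing empty
// cell (e.g. "a,b,") is dropped by std::getline.
std::vector<std::string> split_line(const std::string& line) {
    std::vector<std::string> cells;
    std::stringstream ss(line);
    std::string cell;
    while (std::getline(ss, cell, ',')) {
        cells.push_back(cell);
    }
    return cells;
}

// Hypothetical helper: converts every cell of a line to double.
// std::stod throws std::invalid_argument if a cell is not numeric.
std::vector<double> parse_double_line(const std::string& line) {
    std::vector<double> values;
    for (const std::string& cell : split_line(line)) {
        values.push_back(std::stod(cell));
    }
    return values;
}
```

Unlike the >> based loop used for integers, this version raises an exception on a bad cell instead of stopping silently, which may or may not be what you want.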
Special thanks to papagaga and Incomputable for helping me with this topic via codereview.stackexchange.com.
Source: https://www.gormanalysis.com/blog/reading-and-writing-csv-files-with-cpp/