AriaFallah/csv-parser

CSV Parser

Fast, simple, header-only, C++11 CSV parser.

Usage

Configuration

You initialize the parser by passing it any input stream of characters. For example, you can read from a file

std::ifstream f("some_file.csv");
CsvParser parser(f);

or you can read from stdin

CsvParser parser(std::cin);

The parser can also own the input stream, which is safer when the parser needs to outlive the scope where the stream is created.

auto parser = CsvParser::from_file("some_file.csv");

or

std::unique_ptr<std::istream> input(new std::ifstream("some_file.csv"));
CsvParser parser(std::move(input));

When using the std::istream& constructor, the caller must keep the stream alive for at least as long as the parser.
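As a minimal sketch of the owning pattern, the helper below opens a file, checks that the open succeeded, and returns a std::unique_ptr<std::istream> of the kind the owning constructor above accepts. The name open_csv_input and the choice to throw std::runtime_error are assumptions made for illustration; they are not part of the library.

```cpp
#include <fstream>
#include <istream>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical helper (not part of the library): opens `path` and returns an
// owning stream pointer, or throws if the file cannot be opened.
inline std::unique_ptr<std::istream> open_csv_input(const std::string& path) {
  std::unique_ptr<std::ifstream> file(
      new std::ifstream(path, std::ios::binary));
  if (!file->is_open()) {
    throw std::runtime_error("cannot open CSV file: " + path);
  }
  // Convert the owning pointer to the std::istream base type.
  return std::unique_ptr<std::istream>(std::move(file));
}
```

The result can then be handed to the parser in the same way as the std::move(input) example above, for instance CsvParser parser(open_csv_input("some_file.csv"));, so a missing file is reported before any parsing starts.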

You can also configure the parser by chaining configuration methods:

CsvParser parser = CsvParser(std::cin)
  .delimiter(';')    // delimited by ; instead of ,
  .quote('\'')       // quoted fields use ' instead of "
  .terminator('\0'); // terminated by \0 instead of by \r\n, \n, or \r

Parsing

You can read from the CSV using a range-based for loop. Each row of the CSV is represented as a std::vector<std::string>.

#include <fstream>   // std::ifstream
#include <iostream>
#include "../parser.hpp"

using namespace aria::csv;

int main() {
  std::ifstream f("some_file.csv");
  CsvParser parser(f);

  for (auto& row : parser) {
    for (auto& field : row) {
      std::cout << field << " | ";
    }
    std::cout << std::endl;
  }
}

Behind the scenes, the range-based for loop only ever allocates as much memory as needed to represent a single row of your CSV. If that's too much, you can step down to a lower level and read the CSV one field at a time, which only allocates the memory needed for a single field.

#include <iostream>
#include "./parser.hpp"

using namespace aria::csv;

int main() {
  CsvParser parser(std::cin);

  for (;;) {
    auto field = parser.next_field();
    switch (field.type) {
      case FieldType::DATA:
        std::cout << field.data << " | ";
        break;
      case FieldType::ROW_END:
        std::cout << std::endl;
        break;
      case FieldType::CSV_END:
        std::cout << std::endl;
        return 0;
    }
  }
}

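You can inspect the current cursor position with parser.position(), which returns the position of the last parsed token. This is useful for reporting things like progress through a file. To get the file size, seek to the end with file.seekg(0, std::ios::end) and then read file.tellg(). Seeking moves the stream's read position, so do this before parsing starts and seek back to the beginning afterwards.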

Testing

Run the unit tests with:

cmake -S test -B test/out
cmake --build test/out
./test/out/parser_test

Property tests are opt-in and use RapidCheck:

cmake -S test -B test/out -DARIA_CSV_ENABLE_PROPERTY_TESTS=ON
cmake --build test/out
./test/out/parser_property_test

Fuzz targets live in fuzz/. See fuzz/README.md for libFuzzer and AFL++ commands.
