csvz: a pretty-quick csv parser/reader utility

The parser is stable and performs well.

This tool is not really useful yet: it only parses CSV files (quickly!). The plan is to implement an efficient sink that conforms to the Arrow columnar format.
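To make the planned sink more concrete, here is a minimal sketch of how parsed string fields could be collected into Arrow-style variable-length string columns, where each column holds one contiguous byte buffer plus an offsets array of length n + 1. It is written in Python for brevity and is not csvz code: `StringColumn` and `rows_to_columns` are hypothetical names, and the validity bitmap that Arrow uses for nulls is left out.

```python
from dataclasses import dataclass, field


@dataclass
class StringColumn:
    """Hypothetical Arrow-style string column: offsets plus contiguous bytes."""

    # offsets[i] and offsets[i + 1] delimit value i inside `data`.
    offsets: list = field(default_factory=lambda: [0])
    data: bytearray = field(default_factory=bytearray)

    def append(self, value: bytes) -> None:
        # Values are stored back to back; only the end offset is recorded.
        self.data += value
        self.offsets.append(len(self.data))

    def get(self, i: int) -> bytes:
        return bytes(self.data[self.offsets[i]:self.offsets[i + 1]])

    def __len__(self) -> int:
        return len(self.offsets) - 1


def rows_to_columns(rows, num_columns):
    """Transpose parsed rows (lists of bytes fields) into columns.

    Assumes every row has exactly `num_columns` fields; a real sink would
    need a policy for ragged rows and for null values.
    """
    columns = [StringColumn() for _ in range(num_columns)]
    for row in rows:
        if len(row) != num_columns:
            raise ValueError("ragged row")
        for column, value in zip(columns, row):
            column.append(value)
    return columns


if __name__ == "__main__":
    cols = rows_to_columns([[b"id", b"name"], [b"1", b"ada"], [b"2", b""]], 2)
    print(cols[0].offsets, bytes(cols[0].data))
    print(cols[1].offsets, bytes(cols[1].data))
```

The point of this layout is that a column can be scanned without touching the other columns, and an empty field costs only one repeated offset.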

(Naive) benchmarks: see MarkPFlug/Benchmarks for a comparison with other CSV parsers. The measurements below cover string-only access (the second table on the Benchmarks page). A --release=safe build currently places in the top 10 there, and a --release=fast build in the top 5.

$ zig test src/tests.zig
All 49 tests passed.

$ zig build --release=safe

$ hyperfine --warmup=5 './zig-out/bin/csvz $PWD/data/data_65k_records.csv'
Benchmark 1: ./zig-out/bin/csvz $PWD/data/data_65k_records.csv
  Time (mean ± σ):      34.3 ms ±   0.9 ms    [User: 22.5 ms, System: 12.0 ms]
  Range (min … max):    31.9 ms …  35.6 ms    83 runs

$ poop './zig-out/bin/csvz data/data_65k_records.csv'
Benchmark 1 (148 runs): ./zig-out/bin/csvz data/data_65k_records.csv
  measurement          mean ± σ            min … max           outliers
  wall_time          33.7ms ± 2.26ms    30.4ms … 44.2ms          1 ( 1%)
  peak_rss           41.0MB ±    0      41.0MB … 41.0MB          0 ( 0%)
  cpu_cycles         83.6M  ± 2.27M     80.6M  … 91.4M           2 ( 1%)
  instructions        212M  ± 6.03       212M  …  212M           9 ( 6%)
  cache_references   2.06M  ± 37.1K     1.84M  … 2.15M           3 ( 2%)
  cache_misses       1.33M  ± 76.3K     1.21M  … 1.57M           0 ( 0%)
  branch_misses       475K  ± 3.72K      470K  …  514K           6 ( 4%)



$ zig build --release=fast

$ hyperfine --warmup=5 './zig-out/bin/csvz $PWD/data/data_65k_records.csv'
Benchmark 1: ./zig-out/bin/csvz $PWD/data/data_65k_records.csv
  Time (mean ± σ):      20.6 ms ±   0.4 ms    [User: 15.3 ms, System: 5.4 ms]
  Range (min … max):    19.3 ms …  21.5 ms    128 runs
 
$ poop './zig-out/bin/csvz data/data_65k_records.csv'
Benchmark 1 (234 runs): ./zig-out/bin/csvz data/data_65k_records.csv
  measurement          mean ± σ            min … max           outliers
  wall_time          21.4ms ± 1.10ms    18.8ms … 30.0ms         10 ( 4%)
  peak_rss           14.5MB ±    0      14.5MB … 14.5MB          0 ( 0%)
  cpu_cycles         56.8M  ±  343K     56.5M  … 60.1M           9 ( 4%)
  instructions        136M  ± 3.83       136M  …  136M          14 ( 6%)
  cache_references    216K  ± 7.89K      196K  …  237K           0 ( 0%)
  cache_misses       1.39K  ±  596       706   … 4.86K           2 ( 1%)
  branch_misses       461K  ± 1.24K      458K  …  464K           0 ( 0%)
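For a quick side-by-side of the two builds, the sketch below divides the mean values from the two poop tables above (safe over fast). The dictionary names are illustrative, and the numbers are copied from the output shown here with units expanded to plain counts, seconds, and bytes.

```python
# Mean values copied from the poop tables above.
SAFE = {
    "wall_time": 33.7e-3,
    "peak_rss": 41.0e6,
    "cpu_cycles": 83.6e6,
    "instructions": 212e6,
    "cache_references": 2.06e6,
    "cache_misses": 1.33e6,
    "branch_misses": 475e3,
}
FAST = {
    "wall_time": 21.4e-3,
    "peak_rss": 14.5e6,
    "cpu_cycles": 56.8e6,
    "instructions": 136e6,
    "cache_references": 216e3,
    "cache_misses": 1.39e3,
    "branch_misses": 461e3,
}

# Ratio > 1 means the --release=safe build used more of that resource.
ratios = {name: SAFE[name] / FAST[name] for name in SAFE}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name:<18}{ratio:10.2f}x")
```

On these means, the safe build takes about 1.57x the wall time and about 2.83x the peak memory of the fast build, while branch misses are nearly identical.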