Preliminary test results of each file format

Prepare, Extract, Ingest

No one wants to read extracted data in a text editor and look up the relevant records without any help from software.
Making the data accessible requires ingesting it into a data store and/or a software service.

Test results for the following three steps: data preparation, extraction, and ingestion.

Extraction Results

Extraction time and file size were measured for each file format.
The test data is one year (12,352 records) of general ledger entries for a small entity. Four files were generated, one for each quarterly period.

The results differ by file format: XBRL GL took more time to extract and produced larger files.
However, these differences matter little in today's IT environment.
Software: written in PHP
Hardware: Amazon Web Services EC2 instance
Model: t2.small, vCPU: 1, Memory: 2 GiB, Storage: EBS only
CPU: Intel(R) Xeon(R) CPU E5-2670 v2 @ 2.50GHz, 2500.060 MHz, cache size 25600 KB
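
The kind of measurement described above can be sketched as follows. This is a minimal illustration in Python, not the PHP software used for the test. The record fields are hypothetical, and the XML writer produces a generic XML layout as a stand-in, not real XBRL GL, so the numbers it prints are not comparable to the results reported here.

```python
import csv
import io
import json
import time
import xml.etree.ElementTree as ET


def make_records(n):
    """Build n hypothetical ledger entries (field names are illustrative only)."""
    return [
        {"entry_no": i, "date": "2009-04-01", "account": "1000", "amount": 100 + i}
        for i in range(n)
    ]


def to_csv(records):
    """Serialize records as CSV bytes with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue().encode("utf-8")


def to_json(records):
    """Serialize records as a JSON array (plain JSON, without a JSON-LD context)."""
    return json.dumps(records).encode("utf-8")


def to_xml(records):
    """Serialize records as generic XML (assumption: a stand-in, NOT real XBRL GL)."""
    root = ET.Element("entries")
    for rec in records:
        entry = ET.SubElement(root, "entry")
        for key, value in rec.items():
            ET.SubElement(entry, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def measure(records):
    """Return elapsed seconds and output size in bytes for each format."""
    results = {}
    for name, serialize in (("csv", to_csv), ("json", to_json), ("xml", to_xml)):
        start = time.perf_counter()
        payload = serialize(records)
        elapsed = time.perf_counter() - start
        results[name] = {"seconds": elapsed, "bytes": len(payload)}
    return results


if __name__ == "__main__":
    # 12,352 matches the record count of the one-year test data set.
    for name, result in measure(make_records(12352)).items():
        print(f"{name}: {result['bytes']} bytes in {result['seconds']:.4f} s")
```

Tag-based formats repeat the field names for every record, which is one reason an XML file tends to be larger than the CSV file holding the same entries.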

Ingest Results


There is no significant difference among the file formats.
Step 1 depends on network speed and file size, but its share of the total time is negligible.
Steps 2 to 3 differ by file format, but the difference is negligible.
Execution time consists mostly of steps 4 to 5.
Software: written in JavaScript (jQuery) with the DataTables library
Hardware: MacBook Pro, Processor: 2.5 GHz Core i5, Memory: 8 GB 1600 MHz DDR3

Findings

Extract

Extracted file size differs by file format, but these differences have little effect in today's rich IT environment.
Extraction time also differs by file format, but we should consider the total turnaround time, including preparation time.

Ingest

The difference between file formats is negligible for ingestion. Execution time consists mostly of formatting the table.
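
One way to see why the format matters little at this stage: once each format is parsed into the same row structure, the table-formatting step is identical. The sketch below is a Python illustration under stated assumptions, not the jQuery/DataTables code used in the test. The field names and sample data are hypothetical, and a fixed-width text table stands in for the DataTables rendering.

```python
import csv
import io
import json

# Hypothetical sample data: the same two entries in CSV and in JSON.
CSV_TEXT = "entry_no,account,amount\n1,1000,250\n2,2000,-250\n"
JSON_TEXT = (
    '[{"entry_no": 1, "account": "1000", "amount": 250},'
    ' {"entry_no": 2, "account": "2000", "amount": -250}]'
)


def rows_from_csv(text):
    """Parse CSV text into a list of dicts with string values."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_json(text):
    """Parse a JSON array and convert values to strings to match the CSV rows."""
    return [{key: str(value) for key, value in rec.items()} for rec in json.loads(text)]


def format_table(rows):
    """Render rows as a fixed-width text table (stand-in for table rendering)."""
    if not rows:
        return ""
    columns = list(rows[0].keys())
    widths = {c: max(len(c), *(len(r[c]) for r in rows)) for c in columns}
    lines = [
        " | ".join(c.ljust(widths[c]) for c in columns),
        "-+-".join("-" * widths[c] for c in columns),
    ]
    for row in rows:
        lines.append(" | ".join(row[c].ljust(widths[c]) for c in columns))
    return "\n".join(lines)


if __name__ == "__main__":
    # Only the parser differs; the formatting step receives identical input.
    print(format_table(rows_from_csv(CSV_TEXT)))
    print(format_table(rows_from_json(JSON_TEXT)))
```

Only the small parsing functions are format-specific here. The formatting step, which the test found to dominate execution time, is shared by every format.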

Schema file

A schema is necessary for the ingesting software to verify the data and to import it correctly.
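
As a rough illustration of what a schema gives the ingesting software, the sketch below checks a record against a small field-to-type mapping and converts the values on the way in. The schema and its field names are hypothetical and written in Python for brevity. A real ingest would rely on the schema language of the format in question, for example the XBRL GL taxonomy for XBRL GL files.

```python
from datetime import date
from decimal import Decimal, InvalidOperation

# Hypothetical schema: field name -> converter that raises on invalid input.
SCHEMA = {
    "entry_no": int,
    "posting_date": date.fromisoformat,
    "account": str,
    "amount": Decimal,
}


def validate(record, schema=SCHEMA):
    """Return (typed_record, errors) for one raw record of string values."""
    typed, errors = {}, []
    for field, convert in schema.items():
        if field not in record:
            errors.append(f"missing field: {field}")
            continue
        try:
            typed[field] = convert(record[field])
        except (ValueError, TypeError, InvalidOperation):
            errors.append(f"invalid value for {field}: {record[field]!r}")
    # Fields not declared in the schema are reported rather than silently imported.
    for field in record:
        if field not in schema:
            errors.append(f"unexpected field: {field}")
    return typed, errors


if __name__ == "__main__":
    raw = {
        "entry_no": "1",
        "posting_date": "2009-04-01",
        "account": "1000",
        "amount": "250.00",
    }
    typed, errors = validate(raw)
    print(typed, errors)
```

Without such a schema, the ingesting software cannot tell whether a value like "1000" is an account code that must stay a string or an amount that must become a number.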

Each file format

CSV: gl_detail_20090401_20090630.csv
XBRL GL: gl_detail_20090401_20090630.xml
JSON-LD: gl_detail_20090401_20090630.json

