Search Posts

Visits: 246

General Ledger journal entry

Consider how to define tidy data for General Ledger journal entry. The following is a journal entry example for purchasing paper with cash.

Table 1. General Ledger journal entry
Date Line Description Debit
(JPY)
Credit
(JPY)
Account Number Account Name C N T ID

2009-04-03

用紙
paper

14,858

754

事務用消耗品費
Office consumables expenses

10

札幌
Sapporo

部門
Business unit

943-8-477

2009-04-03

742

191

仮払消費税等
Temporary payment consumption tax

10

札幌
Sapporo

部門
Business unit

943-8-477

2009-04-03

用紙
paper

15,600

111

現金
Cash

0

共通部門
Common department

部門
Business unit

943-8-477

Key C: Business Segment Code, N: Business Segment Name, T: Business Segment Type

Tidy data examples in Wickham

Followings are examples in Wickham[_1], which provides some data about an imaginary experiment in a format commonly seen in the world.

Our vocabulary of rows and columns is simply not rich enough to describe why the two tables represent the same data. In addition to appearance, we need a way to describe the underlying semantics, or meaning, of the values displayed in tables

— Headley Wickham
Tidy Data Journal of Statistical Software August 2014 Volume 59 Issue 10
Table 2. Typical presentation dataset
treatment A treatment B

John Smith

2

Jane Doe

16

11

Mary Johnson

3

1

Table 3. The same data as in Table 2 but structured differently
John Smith Jane Doe Mary Johnson

treatment A

16

3

treatment B

2

11

1

Clause 2.2 in in Wickham[_1] describes Data semantics.

A dataset is a collection of values. Values are organized in two ways. Every value belongs to a variable and an obserbation.
Table 4 reorganizes Table 1 to make the values, variables and observations more clear. The variables are:
1. person, with three possible values (John Smith, Mary Johnson, and Jane Doe)
2. treatment, with two possible values (A and B)
3. result, with five or six values depending on how you think of the missing value (-, 16, 3, 2, 11, 1).

— Headley Wickham
Tidy Data Journal of Statistical Software August 2014 Volume 59 Issue 10
Table 4. The same data as in Table 1 but with variables in columns and observations in rows.
person treatment result value

John Smith

A

Jane Doe

A

16

Mary Johnson

A

3

John Smith

B

2

Jane Doe

B

11

Mary Johnson

B

1

Wickham[_1] defines tidy data to describe the underlying semantics of the values displayed in tables.

In tidy data:
— Each variable forms a column.
— Each observation forms a row.
— Each type of observational unit forms a table.

— Headley Wickham
Tidy Data Journal of Statistical Software August 2014 Volume 59 Issue 10

Semantic structure of General Ledger

Table 5 shows Semantic structure of General Ledger. This hierarhical data structure represeents how variables are organized in digital ledger, which we are going to reorganize to a single tidy table.

Table 5. Semantic structure of General Ledger
ID H Element O R

A25

0

General Ledger

0..n

Aggregation

A25-01

1

•Journal ID

1..1

Identifier

A25-02

1

•Journal Number

0..1

Code

A25-09

1

•Header Description

0..1

Text

A25-10 (A25-A26)

1

•General Ledger Line

1..n

Aggregation

A26-03 (A26-A09)

2

••Chart Of Accounts

1..1

Aggregation

A26-A09-01

3

•••GL Account Number

1..1

Identifier

A26-A09-02

3

•••GL Account Name

0..1

Text

A26-A09-05

3

•••Account Type

0..1

Code

A26-A09-06

3

•••Account Subtype

0..1

Code

A25-A26-06

2

••Effective Date

0..1

Date

A25-A26-08

2

••Journal Entry Type Code

0..1

Code

A25-A26-07

2

••Line Number

0..1

Code

A25-A26-10

2

••Line Description

0..1

Text

A25-A26-11

2

••Source Code

0..1

Code

A25-A26-12

2

••Bill Number

0..1

Code

A25-A26-14

2

••Bill Date

0..1

Date

A25-A26-27

2

••Credit Debit Indicator

0..1

Code

A26-18 (A26-A89)

2

••Multi Currency Amount

1..n

Aggregation

A26-A89-01

3

•••Amount

1..1

Amount

A26-A89-02

3

•••Currency

1..1

Code

A26-41 (A26-A01)

2

••Business Segment

1..n

Aggregation

A26-A01-03

3

•••Segment Reference Level

1..1

Code

A26-A01-01

3

•••Business Segment Code

1..1

Code

A26-A01-02

3

•••Business Segment Name

1..1

Text

A26-A01-04

3

•••Business Segment Type

0..1

Text

Key H: Hierarchy Level, O: Occurence, R: Representation

Codd’s 3rd normal form tables for General Ledger

Traditional approach for representing above semantic structure is defining many Codd’s 3rd normal form tables and connecting these table with relational data base. Tidy data puts focus on a single data set instead of the many connected datasets.

Table 6, Table 7, Table 8, Table 9, and Table 10 are tables for the relational data base.

Table 6. General Ledger
ID H Element O R

A25

0

General Ledger

0..n

Aggregation

A25-01

1

•Journal ID

1..1

Identifier

A25-02

1

•Journal Number

0..1

Code

A25-09

1

•Header Description

0..1

Text

A25-10 (A25-A26)

1

•General Ledger Line

1..n

Aggregation

Key H: Hierarchy Level, O: Occurence, R: Representation

Table 7. General Ledger Line
ID H Element O R

A26

0

General Ledger Line

0..n

Aggregation

A26-03 (A26-A09)

1

•Chart Of Accounts

1..1

Aggregation

A26-06

1

•Effective Date

0..1

Date

A26-08

1

•Journal Entry Type Code

0..1

Code

A26-07

1

•Line Number

0..1

Code

A26-10

1

•Line Description

0..1

Text

A26-11

1

•Source Code

0..1

Code

A26-12

1

•Bill Number

0..1

Code

A26-14

1

•Bill Date

0..1

Date

A26-27

1

•Credit Debit Indicator

0..1

Code

A26-18 (A26-A89)

1

•Multi-Currency Amount

1..1

Aggregation

A26-41 (A26-A01)

1

•Business Segment

1..n

Aggregation

Key H: Hierarchy Level, O: Occurence, R: Representation

Table 8. Chart of Accounts
ID H Element O R

A09

0

Chart Of Accounts

0..n

Aggregation

A09-01

1

•GL Account Number

1..1

Identifier

A09-02

1

•GL Account Name

0..1

Text

A09-05

1

•Account Type

0..1

Code

A09-06

1

•Account Subtype

0..1

Code

Key H: Hierarchy Level, O: Occurence, R: Representation

Table 9. Multi-Currency Amount
ID H Element O R

A89

0

Multi-Currency Amount

0..n

Aggregation

A89-01

1

•Amount

1..1

Amount

A89-02

1

•Currency

1..1

Code

Key H: Hierarchy Level, O: Occurence, R: Representation

Table 10. Business Segment
ID H Element O R

A01

0

Business Segment

0..n

Aggregation

A01-03

1

•Segment Reference Level

1..1

Code

A01-01

1

•Business Segment Code

1..1

Code

A01-02

1

•Business Segment Name

1..1

Text

A01-04

1

•Business Segment Type

0..1

Text

Key H: Hierarchy Level, O: Occurence, R: Representation

Tidying messy data sets problems are specified in Wickham[_1] as follows:

— Column headers are values, not variable names.
— Multiple variables are stored in one column.
— Variables are stored in both rows and columns.
— Multiple types of observational units are stored in the same table.
— A single observational unit is stored in multiple tables.

— Headley Wickham
Tidy Data Journal of Statistical Software August 2014 Volume 59 Issue 10

Tidy data for General Ledger

The first step is detecting variable which is included aggregation table within a table. These variable requires are used for reporting the value if its occurrence is specified as multiple. A General Ledger table includes variables ,which are multiple occurrence aggregation tables, namely General Ledger Line, Multi Currency Amount, and Business Segment.
In our example we know that General Ledger is recorded in a single currency, Local Currency is Reporting Currency. We don’t need variable for multi-currency amount.
There are following three variables .
— d_A25 General Ledger
— d_A26 General Ledger Line
— d_A01 Business Segment
Following table is defined using these three variables and variable for concept as a column header and value column.
d_A26 column corresponding to the three entry lines.

Table 11. Almost Tidy data for General Ledger
d_A25 d_A26 d_A01 ID Element value

943-8-477

A25-01

Journal ID

943-8-477

943-8-477

A25-A26-08

JE Type Code

entries

943-8-477

A25-A26-07

Journal Line Number

8-477

943-8-477

A25-A26-14

Bill Date

2009-04-03

943-8-477

A25-A26-12

Bill Number

manual

943-8-477

A25-A26-11

Source Code

gj

943-8-477

A25-09

Header Description

General

943-8-477

A25-02

Journal Number

134

943-8-477

1

A26-A09-01

GL Account Number

754

943-8-477

1

A26-A09-02

GL Account Name

事務用消耗品費
Office consumables expenses

943-8-477

1

A26-A09-05

Account Type

account

943-8-477

1

A25-A26-27

Credit Debit Indicator

debit

943-8-477

1

A26-A89-01

Amount

14858

943-8-477

1

A26-A89-02

Currency

ISO4217:JPY

943-8-477

1

A25-A26-06

Effective Date

2009-04-03

943-8-477

1

A26-A09-06

Account Subtype

proposed

943-8-477

1

A25-A26-10

Line Description

用紙
paper

943-8-477

1

1

A26-A01-03

Segment Reference Level

account sub

943-8-477

1

1

A26-A01-01

Business Segment Code

10

943-8-477

1

1

A26-A01-02

Business Segment Name

札幌
Sapporo

943-8-477

1

1

A26-A01-04

Business Segment Type

部門
Business unit

943-8-477

2

A26-A09-01

GL Account Number

191

943-8-477

2

A26-A09-02

GL Account Name

仮払消費税等
Temporary payment consumption tax

943-8-477

2

A26-A09-05

Account Type

account

943-8-477

2

A25-A26-27

Credit Debit Indicator

debit

943-8-477

2

A26-A89-01

Amount

742

943-8-477

2

A26-A89-02

Currency

ISO4217:JPY

943-8-477

2

A25-A26-06

Effective Date

2009-04-03

943-8-477

2

A26-A09-06

Account Subtype

tax

943-8-477

2

1

A26-A01-03

Segment Reference Level

account sub

943-8-477

2

1

A26-A01-01

Business Segment Code

10

943-8-477

2

1

A26-A01-02

Business Segment Name

札幌
Sapporo

943-8-477

2

1

A26-A01-04

Business Segment Type

部門
Business unit

943-8-477

3

A26-A09-01

GL Account Number

111

943-8-477

3

A26-A09-02

GL Account Name

現金 Cash

943-8-477

3

A26-A09-05

Account Type

account

943-8-477

3

A25-A26-27

Credit Debit Indicator

credit

943-8-477

3

A26-A89-01

Amount

15600

943-8-477

3

A26-A89-02

Currency

ISO4217:JPY

943-8-477

3

A25-A26-06

Effective Date

2009-04-03

943-8-477

3

A26-A09-06

Account Subtype

proposed

943-8-477

3

A25-A26-10

Line Description

用紙
paper

943-8-477

3

1

A26-A01-03

Segment Reference Level

account sub

943-8-477

3

1

A26-A01-01

Business Segment Code

0

943-8-477

3

1

A26-A01-02

Business Segment Name

共通部門
Common department

943-8-477

3

1

A26-A01-04

Business Segment Type

部門
Business unit

Table 11 is mostly tidy, but observed value contains value of different data type. This is the result of the column Element containd different representation. Note that Tidy data doesn’t limit the number of columns for measured variables. Providing additional columns for each element’s variable results Table 12.

Table 12. Tidy data for General Ledger
d_A25
General Ledger
d_A26
General Ledger Line
d_A01
Business Segment
A25-01
Journal ID
A25-A26-08
Type Code
A25-A26-07
Journal Line Number
A25-A26-14
Bill Date
A25-A26-12
Bill Number
A25-A26-11
Source Code
A25-09
Header Description
A25-02
Journal Number
A26-A09-01
GL Account Number
A26-A09-02
GL Account Name
A26-A09-05
Account Type
A25-A26-27
Credit Debit Indicator
A26-A89-01
Amount
A26-A89-02
Currency
A25-A26-06
Effective Date
A26-A09-06
Account Subtype
A25-A26-10
Line Description
A26-A01-03
Segment Reference Level
A26-A01-01
Business Segment Code
A26-A01-02
Business Segment Name
A26-A01-04
Business Segment Type

943-8-477

943-8-477

entries

8-477

2009-04-03

manual

gj

General

134

943-8-477

1

754

事務用消耗品費
Office consumables expenses

account

debit

14858

JPY

2009-04-03

proposed

用紙
paper

943-8-477

1

1

account sub

10

札幌
Sapporo

部門
Business unit

943-8-477

2

191

仮払消費税等
Temporary payment consumption tax

account

debit

742

JPY

2009-04-03

tax

943-8-477

2

1

account sub

10

札幌
Sapporo

部門
Business unit

943-8-477

3

111

現金
Cash

account

credit

15600

JPY

2009-04-03

proposed

用紙
paper

943-8-477

3

1

account sub

0

共通部門
Common department

部門
Business unit

xBRL taxonomy and xBRL-CSV

xBRL taxonomy

XBRL US[_3] describes requirements for data in 1.3.1 XBRL Provides a Platform to Give Data Meaning

On the left side of Figure 1, there is a piece of primitive numeric data: “1234.” On its own, that data point only relays information to a reader if that recipient is already familiar with its context and purpose. For the purposes of this discussion, that data point can be referred to as a fact. In layers, meaning can be added to the fact by defining precision, scaling, units, and other similar information. Contextual meaning can also be supplied through another layer containing information such as time and location.
Finally, the fact can be identified with a concept. Depending on the semantic component of the standard being used to represent the data, this conceptual information could be quite detailed and include a wealth of additional meaning. As mentioned previously, structured data standards offer defined systems to represent data in this way, by adding layers of meaningful context that provide useful, pre-determined ways to interpret that data. XBRL is one such standard.

— XBRL US
XBRL Taxonomy Development Handbook A Guide for XBRL Taxonomy
thd v10 002
Figure 1. Possible types of information that could accompany a simple numeric data point
XBRL US[_3] describes data and semantic or contextual information of fact in 2.2.2 The Fact (An Intersection of Dimensions and Data)

In XBRL, a fact is the unique intersection of a set of XBRL dimensions with a data point. Figure 2 illustrates the basic structure of a fact. Arbitrary information, such as a number or a name or even a short section of text, has no semantic or contextual information in and of itself. Once XBRL dimensions, which add semantic information, intersect with that data point, it now becomes an XBRL fact.

— XBRL US
XBRL Taxonomy Development Handbook A Guide for XBRL Taxonomy
thd v10 007
Figure 2. A data point versus an XBRL fact

Table 13 shows the relationship between Tidy data terms and xBRL terms.
.Relation among Tidy data and xBRL

Tidy data xBRL

value

fact

variable

dimension

observation

data point

XBRL US[_3] describes how to specify facts using dimensions

The concepts would be the line items in Figure 3 (such as Food Expenses, Rent Expenses, and Utilities Expenses). The periods would be represented by the columns (January, February, and so forth). Again, each place where a concept dimension and period dimension intersect is a fact (in this case, a monetary amount located in the cell), and the combination of the contextual information provided by the concept and other dimensions (the period) create the XBRL dimensions of that fact.

— XBRL US
XBRL Taxonomy Development Handbook A Guide for XBRL Taxonomy
thd v10 011
Figure 3. Multiple concepts and XBRL dimensions intersecting to form multiple facts

xBRL-CSV

xBRL-CSV[_2] supports this shape of Tidy data.
The data I tried this time was converted from PCA Accounting 9V.2 journal entry data to XBRL-GL in 2003 (2829records 31MB).
Example xbrlgl/instances/0001-20091231-145-2109-1-4679.xml
The article page for XBRL-GL is https://www.sambuichi.jp/?p=345.

Converted with Arelle saveLoadableOIM (Table 11)

xBRL-CSV data adc-instances2-facts.csv (18MB)
xBRL-CSV meta data adc-instances2-metadata.json (1.1KB)

Tidy Data Conversion (Table 12) -Prototype-

xBRL-CSV data adc-instances.csv (2.4MB) 1/12 or more reduction
xBRL-CSV meta data adc-instances.json (4.6KB)

xBRL taxonomy

Schema core.xsd
Definition Linkbase core-def.xml
Presentation Linkbase core-pre.xml
Label Linkbase core-label-en.xml

Bibliography

[1] Tidy Data, Journal of Statistical Software, August 2014, Volume 59, Issue 10, Headley Wickham
https://www.jstatsoft.org/article/view/v059i10
[2] xBRL-CSV: CSV representation of XBRL data 1.0, Recommendation 13 October 2021
https://www.xbrl.org/Specification/xbrl-csv/REC-2021-10-13/xbrl-csv-REC-2021-10-13.html
[3] XBRL Taxonomy Development Handbook A Guide for XBRL Taxonomy, Developers First Edition (July 2020), XBRL US
https://xbrlus.github.io/docs/tdh.html#a_Toc45794887