Streamlining UML Diagram Automation for ISO 21378 Audit Data Collection

Nobuyuki SAMBUICHI
ISO/TC295 Audit data services
Convener at SG1 Semantic model
Co-project Leader at AWI 21926 Semantic data model for audit data services

1. Introduction

Visual representations make data models easier to review and discuss. This post explains how to automate the creation of UML (Unified Modeling Language) class diagrams from CSV (Comma-Separated Values) files using Python and PlantUML.

2. Objective

Our primary goal is to transform class information stored in CSV files into structured UML diagrams, useful for software developers and system architects.

3. Tool and Environment Setup

To begin, ensure you have Python installed. Then, set up PlantUML:

Download PlantUML: Go to the official PlantUML website and download plantuml.jar.
Install Java: PlantUML requires Java. Download and install the Java Runtime Environment (JRE).
Verify PlantUML Setup: Run java -jar path/to/plantuml.jar -version in your terminal. Replace path/to/plantuml.jar with your PlantUML jar file path.
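
The same check can also be run from Python, which is convenient because the diagram script calls PlantUML later anyway. The following is a minimal sketch, not the script described in this post: the function names plantuml_version_command and check_plantuml are hypothetical, and the jar path is whatever location you downloaded plantuml.jar to.

```python
import shutil
import subprocess
from pathlib import Path


def plantuml_version_command(jar_path):
    """Build the command line that asks PlantUML to print its version."""
    return ["java", "-jar", str(jar_path), "-version"]


def check_plantuml(jar_path):
    """Return True if Java is on PATH, the jar exists, and PlantUML runs."""
    jar = Path(jar_path)
    if shutil.which("java") is None:
        print("Java was not found on PATH.")
        return False
    if not jar.is_file():
        print(f"PlantUML jar not found: {jar}")
        return False
    # Run the same command as the manual check and show what it printed.
    result = subprocess.run(
        plantuml_version_command(jar), capture_output=True, text=True
    )
    print(result.stdout or result.stderr)
    return result.returncode == 0
```

Calling check_plantuml("path/to/plantuml.jar") with your own jar path returns False, with a short message, when Java or the jar is missing.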

4. Building the Python Script and Its Functions

The Python script is designed to automate the UML diagram creation process by performing several key functions:

Path Conversion: Converts relative paths to absolute paths for reliable file access.
Reading Target Classes: Reads specific class names from target_classes_file to selectively generate UML diagrams based on user-defined criteria.
Parsing CSV Data: Extracts essential class properties and relationships from object_class.
Generating PlantUML Code: Transforms extracted data into PlantUML syntax, setting the stage for diagram generation.
Exporting to UML Diagram: Writes the PlantUML code to a file and employs PlantUML to convert this code into a visual UML diagram.
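
The first and last steps in the list above, path conversion and export, could be sketched as follows. This is an illustration under assumptions rather than the actual script: the helper names (to_absolute, write_puml, render_command, render) are hypothetical, and the output format is passed to PlantUML with its -t option (for example -tpng or -tsvg).

```python
import subprocess
from pathlib import Path


def to_absolute(path, base_dir=None):
    """Resolve a possibly relative path against base_dir (default: cwd)."""
    p = Path(path)
    if not p.is_absolute():
        p = Path(base_dir or Path.cwd()) / p
    return p.resolve()


def write_puml(puml_text, output_path):
    """Write PlantUML source to output_path and return its absolute path."""
    out = to_absolute(output_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(puml_text, encoding="utf-8")
    return out


def render_command(jar_path, puml_path, image_format="png"):
    """Build the PlantUML command line for rendering one .puml file."""
    return ["java", "-jar", str(jar_path), f"-t{image_format}", str(puml_path)]


def render(jar_path, puml_path, image_format="png"):
    """Run PlantUML; the image is written next to the .puml file."""
    subprocess.run(render_command(jar_path, puml_path, image_format), check=True)
```

Keeping render_command separate from render makes it possible to inspect or log the exact command before Java is started.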

Because the script processes only the relevant rows of the object_class file (mim.csv in this example) and the class names listed in target_classes_file, the resulting UML diagrams stay focused on the part of the semantic model under review, which makes complex models easier to understand.

5. Understanding the CSV Input Format

5.1. object_class

The object_class file, used as input, is a Refined Message Information Model (R-MIM) for HL7 graphwalk, aimed at producing a Hierarchical Message Definition (HMD). The file contains the following columns:

ID: A unique identifier for each entry.
Type: The type of the entry, such as attribute, association, etc.
Level: The hierarchical level of the entry in the model.
Multiplicity: Defines the multiplicity of the relationship.
ClassName: The name of the class.
AttributeName: The name of the attribute.
Datatype: The datatype of the attribute.
AssociatedClass: Any class that is associated with the current entry.
Description: A brief description of the entry.
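
To make the layout concrete, here is a small sketch that reads a file with these nine columns using Python's csv module. It assumes the header row uses the column names exactly as listed above. The sample rows (Invoice, InvoiceLine) are invented for illustration and are not taken from ISO 21378 or from the actual mim.csv. The function name read_rows is hypothetical.

```python
import csv
import io

# Invented sample data; only the header mirrors the column list above.
SAMPLE_CSV = """\
ID,Type,Level,Multiplicity,ClassName,AttributeName,Datatype,AssociatedClass,Description
1,attribute,1,1,Invoice,invoiceNumber,String,,Number printed on the invoice
2,attribute,1,0..1,Invoice,issueDate,Date,,Date the invoice was issued
3,association,1,1..*,Invoice,lines,,InvoiceLine,Lines that belong to the invoice
4,attribute,2,1,InvoiceLine,amount,Decimal,,Line amount
"""

EXPECTED_COLUMNS = [
    "ID", "Type", "Level", "Multiplicity", "ClassName",
    "AttributeName", "Datatype", "AssociatedClass", "Description",
]


def read_rows(csv_file):
    """Read all rows as dicts and fail early if a column is missing."""
    reader = csv.DictReader(csv_file)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing columns: {', '.join(missing)}")
    return list(reader)


rows = read_rows(io.StringIO(SAMPLE_CSV))
```

For a real file, pass an open file object instead of the StringIO, for example open(path, newline="", encoding="utf-8").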

The Python script uses only a subset of these columns to construct the UML diagrams, so make sure the CSV file keeps the following column names for the script to parse it correctly:

ClassName: Utilized to identify and create UML classes.
AttributeName: Used to define attributes within each UML class.
Type: Indicates the nature of the entry, whether it’s an attribute or a type of relationship (like association or aggregation).
Datatype: Specifies the data type of attributes in the UML class.
AssociatedClass: Important for defining relationships between classes, especially in cases of associations, aggregations, and compositions.

These columns are essential for accurately mapping the R-MIM data into a structured UML diagram, capturing the necessary details for each class and their inter-relationships.
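
One possible way to turn these five columns into PlantUML text is sketched below. It is not the author's script. It assumes that the Type column holds the literal values attribute, association, aggregation, or composition, and it maps the three relationship types to the PlantUML arrows -->, o--, and *--. The names ARROWS, field, and to_plantuml are hypothetical.

```python
# Assumed Type values mapped to PlantUML class-diagram arrows.
ARROWS = {"association": "-->", "aggregation": "o--", "composition": "*--"}


def field(row, key):
    """Return a trimmed cell value, treating missing cells as empty."""
    return (row.get(key) or "").strip()


def to_plantuml(rows):
    """Build PlantUML class-diagram source from object_class rows."""
    classes = {}    # class name -> list of attribute lines
    relations = []  # one PlantUML line per relationship row
    for row in rows:
        cls = field(row, "ClassName")
        if not cls:
            continue
        attrs = classes.setdefault(cls, [])
        kind = field(row, "Type").lower()
        name = field(row, "AttributeName")
        target = field(row, "AssociatedClass")
        if kind == "attribute":
            dtype = field(row, "Datatype")
            attrs.append(f"{name} : {dtype}" if dtype else name)
        elif kind in ARROWS and target:
            line = f"{cls} {ARROWS[kind]} {target}"
            relations.append(f"{line} : {name}" if name else line)
    lines = ["@startuml"]
    for cls, attrs in classes.items():
        lines.append(f"class {cls} {{")
        lines.extend(f"  {a}" for a in attrs)
        lines.append("}")
    lines.extend(relations)
    lines.append("@enduml")
    return "\n".join(lines) + "\n"
```

Rows whose Type is not recognized are skipped silently in this sketch; a production script might log them instead.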

5.2. target_classes_file

The object_class file, integral to this semantic model, contains a range of classes related to the Refined Message Information Model (R-MIM) for HL7. The target_classes_file is used to generate UML class diagrams selectively. It contains a list of class names, one per row. The Python script reads this file and processes only the listed classes from mim.csv, so each diagram covers just the classes the user needs.
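
The filtering step can be sketched in a few lines. This is a minimal illustration, assuming one class name per row as described above. The function names read_target_classes and filter_rows are hypothetical.

```python
def read_target_classes(lines):
    """Collect class names, one per row, ignoring blank rows."""
    return {line.strip() for line in lines if line.strip()}


def filter_rows(rows, targets):
    """Keep only the rows whose ClassName is one of the target classes."""
    return [row for row in rows if (row.get("ClassName") or "").strip() in targets]
```

A typical call would be read_target_classes(open(path, encoding="utf-8")) followed by filter_rows(rows, targets). Note that this sketch filters on ClassName only, so a relationship from a target class to a class outside the list is kept; whether to drop such relationships is a design choice.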

6. Usage

Prepare the object_class CSV with your class information and list the classes you want in target_classes_file, then run the Python script to generate the UML diagram.

7. Conclusion

This automation simplifies creating UML diagrams from CSV files: Python handles the parsing and filtering, and PlantUML handles the rendering.