Guideline for Analyzing Competency Questions
This guideline outlines the process for working with Competency Questions.
1- Pre-processing Phase
In this phase, we lay the foundation for handling Competency Questions (CQs) by ensuring data privacy, clarity, and organization, so that the questions can be analyzed and used effectively.
1-1- Anonymizing and Preserving Privacy
To maintain data privacy and confidentiality, we anonymize personal identifiers of each stakeholder. This involves replacing actual names with labels such as 'User 1' or 'Stakeholder 1.' However, we will keep a separate record of the original identities for potential future reference.
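As an illustration, the replacement of names with labels could be sketched as follows. This is a minimal sketch that assumes the CQs arrive as (name, question) pairs; the function name, the label format, and the example data are hypothetical.

```python
def anonymize_stakeholders(records, prefix="Stakeholder"):
    """Replace stakeholder names with neutral labels.

    records: list of (name, question) pairs (an assumed input shape).
    Returns the anonymized pairs and a label -> name record, which
    should be stored separately from the anonymized data.
    """
    name_to_label = {}
    anonymized = []
    for name, question in records:
        # Each distinct name gets one label, numbered in order of first appearance.
        if name not in name_to_label:
            name_to_label[name] = f"{prefix} {len(name_to_label) + 1}"
        anonymized.append((name_to_label[name], question))
    identity_record = {label: name for name, label in name_to_label.items()}
    return anonymized, identity_record


# Hypothetical example data
records = [
    ("Alice Example", "Which documents are required for process X?"),
    ("Bob Example", "Who is responsible for process X?"),
    ("Alice Example", "How long does process X take?"),
]
anonymized, identity_record = anonymize_stakeholders(records)
```

The identity record is returned separately so that it can be kept apart from the anonymized questions, for example under restricted access.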
1-2- Data Protection Regulations
In line with data protection regulations, we include a statement in the header of the pad of each stakeholder who contributes Competency Questions. The statement informs participants that their information will be processed anonymously and emphasizes compliance with regulations such as GDPR. If necessary, we can also provide a data privacy statement for participants to acknowledge and agree to.
1-3- Assigning Unique IDs
For effective tracking and management, each Competency Question is assigned a unique ID. These IDs, such as CQ1, CQ2, etc., facilitate organization and future reference. All the gathered CQs are combined into one file to make it easier to find and use them.
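The ID assignment and the combined file could look roughly like this. The sketch assumes a CSV file with the columns "id" and "question"; the file layout and function names are illustrative choices, not a fixed requirement.

```python
import csv
import io


def assign_cq_ids(questions, start=1):
    """Pair each question with a sequential ID such as CQ1, CQ2, ..."""
    return [(f"CQ{i}", question) for i, question in enumerate(questions, start=start)]


def write_cq_file(rows, handle):
    """Write all CQs to one CSV file (assumed columns: id, question)."""
    writer = csv.writer(handle)
    writer.writerow(["id", "question"])
    writer.writerows(rows)


# Hypothetical example questions
questions = [
    "Which documents are required for process X?",
    "Who is responsible for process X?",
]
rows = assign_cq_ids(questions)

buffer = io.StringIO()  # stands in for a real file opened with open(..., "w", newline="")
write_cq_file(rows, buffer)
```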
1-4- Masking Sensitive Data
Where applicable, we mask or alter specific sensitive details within CQs, such as numbers, dates, or locations, while ensuring that the main meaning and purpose of the questions remain intact.
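Masking of numbers and dates can be partly automated with regular expressions. The sketch below assumes only two date formats (YYYY-MM-DD and D.M.YYYY) and plain numbers; the patterns and placeholder tokens are illustrative. Locations are not covered, since they usually require manual review or a named-entity recognition step.

```python
import re

# Assumed date formats: 2023-04-01 and 1.4.2023
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b\d{1,2}\.\d{1,2}\.\d{4}\b")
# Integers and simple decimals such as 125 or 3.5
NUMBER_PATTERN = re.compile(r"\b\d+(?:[.,]\d+)?\b")


def mask_sensitive(text):
    """Replace dates and numbers with placeholders, leaving the wording intact."""
    # Dates are masked first so their digits are not treated as plain numbers.
    text = DATE_PATTERN.sub("[DATE]", text)
    text = NUMBER_PATTERN.sub("[NUMBER]", text)
    return text


# Hypothetical example question
masked = mask_sensitive(
    "How many of the 125 applications filed on 2023-04-01 were approved?"
)
```

Because only the sensitive tokens are replaced, the question keeps its structure and intent.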
1-5- Ensuring CQ Understandability
To ensure the questions are clear and understandable for our working group, we rephrase domain-specific terms and abbreviations and make necessary adjustments. This step enhances clarity and readability, aligning the questions with the group's expertise.
1-6- Quality Assurance of Initial CQs
Before proceeding, we review the collected Competency Questions to ensure they are correctly framed and aligned with the project's objectives.
1-7- Documenting Modifications
For transparency and accountability, we keep a detailed log of all modifications made during the anonymization and masking processes. This log serves as a point of reference and helps preserve the accuracy of the CQs.
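One possible shape for such a log is a list of entries that record the text before and after each change. The field names and the helper function below are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Modification:
    cq_id: str
    step: str      # e.g. "anonymization" or "masking"
    before: str
    after: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def log_modification(log, cq_id, step, before, after):
    """Append an entry only if the text actually changed; return the new text."""
    if before != after:
        log.append(Modification(cq_id, step, before, after))
    return after


# Hypothetical example
log = []
text = log_modification(
    log, "CQ1", "masking",
    "How many of the 125 applications were approved?",
    "How many of the [NUMBER] applications were approved?",
)
text = log_modification(log, "CQ1", "rephrasing", text, text)  # no change, not logged
```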
2- Organizing and Prioritizing Phase
In this phase, we focus on organizing and prioritizing Competency Questions.
2-1- Checking for Overlap or Redundancy
We examine the collected CQs to identify any redundancy or overlap among them.
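A first pass over the CQs can flag near-duplicate wording automatically before a manual review. The sketch below uses string similarity from Python's standard library; the threshold of 0.8 is an assumed starting value, and questions that overlap in meaning but differ in wording will still need human judgment.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_overlaps(cqs, threshold=0.8):
    """Return (id_a, id_b, score) for CQ pairs whose wording is similar.

    cqs: dict mapping CQ IDs to question text.
    threshold: assumed cut-off between 0 and 1; tune it on real data.
    """
    # Lowercase and collapse whitespace so formatting differences are ignored.
    normalized = {cq_id: " ".join(text.lower().split()) for cq_id, text in cqs.items()}
    pairs = []
    for (id_a, text_a), (id_b, text_b) in combinations(normalized.items(), 2):
        score = SequenceMatcher(None, text_a, text_b).ratio()
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return pairs


# Hypothetical example questions
cqs = {
    "CQ1": "Which documents are required for process X?",
    "CQ2": "which  documents are required for process X?",
    "CQ3": "Who is responsible?",
}
overlaps = find_overlaps(cqs)
```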
2-2- Prioritizing Based on Relevance and Importance
We prioritize the CQs based on their relevance to the project's goals and their overall importance.
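If relevance and importance are rated numerically, the ranking can be computed with a simple weighted score. The 1 to 5 rating scale and the equal weights below are assumptions for illustration; the working group would need to agree on the actual scheme.

```python
def prioritize(ratings, relevance_weight=0.5, importance_weight=0.5):
    """Return CQ IDs sorted from highest to lowest weighted score.

    ratings: dict mapping CQ IDs to (relevance, importance) ratings,
             assumed here to be on a 1-5 scale.
    """
    def score(item):
        _, (relevance, importance) = item
        return relevance_weight * relevance + importance_weight * importance

    ranked = sorted(ratings.items(), key=score, reverse=True)
    return [cq_id for cq_id, _ in ranked]


# Hypothetical ratings
ratings = {"CQ1": (3, 2), "CQ2": (5, 4), "CQ3": (4, 4)}
ranking = prioritize(ratings)
```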
2-3- Creating Consolidated Questions
We write new, consolidated questions derived from the original ones by splitting, unifying, ... them according to our overall understanding of the information needs.
2-4- Analyzing Gaps and Coverage
We analyze whether the collected CQs cover a comprehensive range of topics or leave significant gaps in coverage. If we choose an iterative process for gathering and analyzing CQs, this step may not be straightforward, especially where stakeholder responses are missing (e.g., because stakeholders were unavailable or have not yet been approached).
2-5- Documenting the Process
We ensure that all processes conducted in this phase are documented.
2-6- Defining Required Categories
We decide on the necessary categories, considering options such as categories discussed during the second openDVA Congress or categories based on the GerPS ontology. Additionally, we explore grouping CQs based on their similarity and identifying common topics among stakeholders. If possible, subcategories will be defined within each category.
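Once categories are chosen, the grouping itself is straightforward to record and check. In the sketch below, the category names are placeholders and do not reflect the openDVA or GerPS categories; the function also reports categories without any CQ, which can feed back into the gap analysis.

```python
def group_by_category(assignments, categories):
    """Group CQ IDs by category.

    assignments: dict mapping CQ IDs to a category name.
    categories: list of agreed category names.
    Returns (groups, uncategorized CQ IDs, categories without any CQ).
    """
    groups = {category: [] for category in categories}
    uncategorized = []
    for cq_id, category in assignments.items():
        if category in groups:
            groups[category].append(cq_id)
        else:
            uncategorized.append(cq_id)
    empty = [category for category, ids in groups.items() if not ids]
    return groups, uncategorized, empty


# Placeholder categories and assignments
categories = ["Category A", "Category B", "Category C"]
assignments = {"CQ1": "Category A", "CQ2": "Category A", "CQ3": "Category B", "CQ4": "Other"}
groups, uncategorized, empty = group_by_category(assignments, categories)
```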
3- Reflection and Usage of CQs in KG
In this phase, we reflect on the Competency Questions and use them in the Knowledge Graph (KG) context, that is, we link the CQs with the KG.
3-1- Identifying Classes and Overall Hierarchy
From the CQs, we identify relevant classes and an initial hierarchy.
3-2- Identifying Data Sources for Reuse within the KG
We identify terminologies, ontologies, and other data sources to be reused in the final graph, based on the important topics derived from the CQs. This step will likely lead us to revise some of the decisions made in 3-1.
3-3- Identifying Key Attributes and Properties
We focus on identifying, within each class of the ontology/KG, the attributes or properties that play an important role in addressing the CQs.
3-4- Preparing for Integration into KG
We detail the process of integrating the identified attributes and properties of CQs into the Knowledge Graph, ensuring effective linkage and representation.
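To keep the linkage traceable, each CQ can be recorded together with the classes and properties it relies on. The structure below is one possible sketch; the class and property names are hypothetical examples, not part of any agreed model.

```python
from dataclasses import dataclass, field


@dataclass
class CQMapping:
    cq_id: str
    classes: list = field(default_factory=list)
    properties: list = field(default_factory=list)


def unmapped_cqs(mappings):
    """Return the IDs of CQs that are not yet linked to any class."""
    return [mapping.cq_id for mapping in mappings if not mapping.classes]


# Hypothetical class and property names
mappings = [
    CQMapping("CQ1", classes=["Process", "Document"], properties=["requiresDocument"]),
    CQMapping("CQ2"),
]
pending = unmapped_cqs(mappings)
```

A list like `pending` shows at a glance which CQs still need to be analyzed before integration.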
4- Ontology Development and Knowledge Graph Modeling Phase
In this phase, we use the insights gained from the Competency Questions to design and build a Knowledge Graph (KG) that effectively represents the domain's interests.
4-1- Defining Classes and Relationships
We identify the entities (classes) and relationships that will form the foundation of our Knowledge Graph. We map these elements to the categories and concepts derived from the Competency Questions. Wherever possible, existing vocabularies, terminologies, and ontologies will be reused.
4-2- Mapping Attributes and Properties
We map the attributes and properties identified from the Competency Questions to the corresponding entities in the KG. This ensures alignment between the collected information and the KG's structure.
4-3- Ontology Population
We populate our KG with real instances to enrich its content.
4-4- KG Validation and Testing
We validate and test the Knowledge Graph against a set of quality criteria to verify that it accurately reflects the insights obtained from the Competency Questions. Example answers to the CQs will be provided as A-Box instances. Accordingly, a SPARQL query will be created for each CQ to verify that the KG can answer the questions posed to it.
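The per-CQ check could be organized as a small test harness like the one sketched below. The namespace, class names, and property names in the queries are hypothetical, and `execute` is a placeholder for whatever SPARQL engine the project uses; the stub shown here only imitates an engine for illustration and does not evaluate SPARQL.

```python
def check_cq_coverage(cq_queries, execute):
    """Run one ASK query per CQ and report which CQs the KG can answer.

    cq_queries: dict mapping CQ IDs to SPARQL ASK query strings.
    execute: callable that takes a query string and returns True or False;
             a placeholder for the project's actual SPARQL engine.
    """
    return {cq_id: bool(execute(query)) for cq_id, query in cq_queries.items()}


# Hypothetical namespace, classes, and properties
cq_queries = {
    "CQ1": """
        PREFIX ex: <http://example.org/>
        ASK { ?process a ex:Process ; ex:requiresDocument ?document . }
    """,
    "CQ2": """
        PREFIX ex: <http://example.org/>
        ASK { ?process a ex:Process ; ex:hasResponsibleParty ?party . }
    """,
}


def stub_execute(query):
    # Illustration only: pretends the KG holds ex:requiresDocument statements
    # and nothing else. Replace with a call to a real SPARQL engine.
    return "ex:requiresDocument" in query


results = check_cq_coverage(cq_queries, stub_execute)
unanswered = [cq_id for cq_id, answered in results.items() if not answered]
```

CQs listed in `unanswered` point to missing classes, properties, or A-Box instances and can be fed into the iterative refinement step.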
4-5- Iterative Refinement
We consider an iterative approach to refine the KG based on feedback and further analysis.