C: Data Architecture
1: Approach
1.1: Data Structure
Data Architecture Capabilities
A Data Architecture should handle the following types of data:
- Data at Rest: Data stored in databases or data warehouses.
- Data in Motion: Data being processed in transactions or exchanged via services/APIs.
- Data in Use: Data being actively used at the application's interface (e.g., a graphical user interface, or GUI).
- Open Data: Publicly accessible data that the organization is required or chooses to share.
Data Architecture Metamodel Entities
Data Architecture is created using three metamodel entities:
- Data Entity: Represents a conceptual data model to aid developers in understanding the concepts they will work with. This can include relationships and constraints (e.g., "a customer can only have one address").
- Logical Data Component: Utilized to create logical data models that give a clear overview of all data used within the IT environment. This model serves as a requirement for data stored in applications (at rest), transferred between applications (in motion), or displayed in user interfaces (data in use).
- Physical Data Component: A cluster of logical data components, either already implemented by previous projects (for example, as XML messages or database schemas) or specified as a requirement for new implementations.
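The decomposition from physical data component to logical data component to data entity can be sketched in code. This is a minimal illustration, not a TOGAF-defined schema: the class fields, the `decompose` helper, and the sample names (`crm_db`, `Customer Profile`) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataEntity:
    """Conceptual concept, e.g. 'Customer', with optional constraints."""
    name: str
    constraints: list[str] = field(default_factory=list)


@dataclass
class LogicalDataComponent:
    """Logical grouping of data entities."""
    name: str
    entities: list[DataEntity]


@dataclass
class PhysicalDataComponent:
    """Implemented (or required) cluster of logical data components."""
    name: str
    kind: str  # e.g. "database schema" or "XML message"
    realizes: list[LogicalDataComponent]


def decompose(component: PhysicalDataComponent) -> list[str]:
    """List each physical > logical > entity chain as a readable path."""
    return [
        f"{component.name} > {logical.name} > {entity.name}"
        for logical in component.realizes
        for entity in logical.entities
    ]


# Hypothetical sample data
customer = DataEntity("Customer", ["a customer can only have one address"])
address = DataEntity("Address")
customer_profile = LogicalDataComponent("Customer Profile", [customer, address])
crm_schema = PhysicalDataComponent("crm_db", "database schema", [customer_profile])
paths = decompose(crm_schema)
```

Keeping the three levels as separate types makes it possible to list, for any implemented schema or message, which conceptual entities it ultimately carries.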
Data Exchange Models
All three metamodel entities can be used in models that describe data exchanges between:
- Information System (IS) services
- Logical application components
- Physical application components
Quality Attributes
Each of these entities can carry quality attributes specific to its situation, which helps protect data integrity and reliability.
This table encapsulates the essential aspects of Data Architecture, detailing the types of data it should handle and the metamodel entities involved in its construction.
1.2: Data Management Considerations
Importance of Data Management
When an enterprise undertakes an architectural transformation, a structured and comprehensive approach to data management lets it use its data effectively and turn that data into a competitive advantage.
Key Considerations
1. System of Record: Define which application components will serve as the system of record or reference for enterprise master data.
2. Enterprise-Wide Standards: Determine if there will be an enterprise-wide standard that all application components, including software packages, must adopt.
3. Data Utilization: Clearly understand how data entities are utilized by business capabilities, functions, processes, and services.
4. Data Lifecycle Understanding: Clearly map out how and where enterprise data entities are created, stored, transported, and reported.
5. Data Transformation Complexity: Assess the level and complexity of data transformations required to support information exchange needs between applications.
6. Integration Requirements: Identify requirements for software tools to support data integration with customers and suppliers, such as:
- ETL (Extract, Transform, Load) tools for data migration
- Data profiling tools to evaluate data quality
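The system-of-record consideration can be checked mechanically once the roles are written down. The sketch below assumes a policy that the text does not state, namely that every master data entity should have exactly one system of record; the application names, entity names, and role labels are hypothetical.

```python
from collections import defaultdict

# Hypothetical declarations: (application component, master data entity, role)
declarations = [
    ("CRM", "Customer", "system_of_record"),
    ("Billing", "Customer", "reference"),
    ("ERP", "Product", "system_of_record"),
    ("Webshop", "Product", "system_of_record"),
    ("Billing", "Invoice", "reference"),
]


def system_of_record_issues(rows):
    """Return entities that have zero or more than one system of record."""
    owners = defaultdict(list)
    entities = set()
    for application, entity, role in rows:
        entities.add(entity)
        if role == "system_of_record":
            owners[entity].append(application)
    # Assumed policy: exactly one system of record per entity.
    return {
        entity: owners[entity]
        for entity in sorted(entities)
        if len(owners[entity]) != 1
    }


issues = system_of_record_issues(declarations)
```

In this sample, `Invoice` is flagged because no application owns it and `Product` is flagged because two applications claim it.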
Additional Guidance
Further insights on data management can be found in the TOGAF® Series Guide: Information Architecture — Customer Master Data Management.
This table highlights the essential aspects of Data Management, emphasizing the significance of structured data handling and the considerations that organizations should take into account during architectural transformations.
1.3: Data Migration Considerations
When an existing application is replaced, its data must be migrated to the new application. The key considerations for data migration within the Data Architecture are:
Data Types to Migrate
- Master Data: Core data that is essential for operations, such as customer and product information.
- Transactional Data: Data related to business transactions.
- Reference Data: Supporting data that provides context, like codes and classifications.
Migration Requirements
- Identify specific data migration requirements based on the target application’s needs.
- Establish a clear understanding of data to be migrated and its sources.
Transformation Needs
- Assess the level of transformation required to adapt the existing data for the new application.
- Identify the necessary adjustments to format and structure.
Data Quality Considerations
- Implement processes for data cleansing and weeding to ensure the target application is populated with high-quality data.
- Conduct data validation checks to confirm that migrated data meets quality standards.
Common Data Definitions
- Establish enterprise-wide common data definitions to standardize data across the organization.
- Ensure that all stakeholders agree on the definitions to support successful data transformation.
Objectives of Data Migration
- Ensure that the target application is populated with accurate and complete data.
- Minimize disruptions to business operations during the migration process.
The successful migration of data during application replacement hinges on clear identification of migration requirements, understanding the transformation needed, ensuring high data quality, and establishing common data definitions. By addressing these considerations, organizations can facilitate a smoother transition to new applications while maintaining the integrity and usability of their data.
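The cleansing and validation steps can be illustrated with a small sketch. The field names (`customer_id`, `country`) and the two quality rules are invented for the example; a real migration would derive its rules from the target application's requirements and the agreed common data definitions.

```python
def cleanse(record: dict) -> dict:
    """Trim whitespace and upper-case the country code (illustrative rules)."""
    cleaned = {key: value.strip() for key, value in record.items()}
    cleaned["country"] = cleaned["country"].upper()
    return cleaned


def validate(record: dict) -> list[str]:
    """Return quality problems; an empty list means the record passes."""
    problems = []
    if not record["customer_id"]:
        problems.append("missing customer_id")
    if len(record["country"]) != 2:
        problems.append("country is not a 2-letter code")
    return problems


def migrate(source_rows):
    """Split cleansed rows into loadable records and rejected records."""
    loaded, rejected = [], []
    for row in source_rows:
        cleaned = cleanse(row)
        problems = validate(cleaned)
        if problems:
            rejected.append((cleaned, problems))
        else:
            loaded.append(cleaned)
    return loaded, rejected


# Hypothetical legacy extract
legacy_rows = [
    {"customer_id": " C001 ", "name": "Ada ", "country": "gb"},
    {"customer_id": "", "name": "Unknown", "country": "Germany"},
]
loaded, rejected = migrate(legacy_rows)
```

Rejected records are kept together with the reasons they failed, so they can be corrected at the source or weeded out rather than silently dropped.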
1.4: Data Governance Considerations
Effective data governance is crucial for ensuring that an enterprise can manage and transform its data assets effectively. The following dimensions outline the key components of a comprehensive data governance framework:
Structure
- Organizational Structure: Establish clear roles and responsibilities for data governance within the enterprise.
- Standards Bodies: Identify or create committees or teams that set and enforce data management standards and policies.
- Data Stewardship: Assign data stewards to oversee specific data domains, ensuring data quality and compliance.
Management System
- Data Governance Framework: Develop a framework that outlines policies, processes, and tools for managing data throughout its lifecycle.
- Programs and Initiatives: Implement data governance programs that address data quality, data management, and compliance with regulatory requirements.
- Lifecycle Management: Ensure that there are processes for data creation, usage, maintenance, and retirement that are adhered to across the organization.
People
- Skill Identification: Assess the current skill sets of employees and identify gaps in data-related competencies.
- Role Definition: Clearly define roles related to data governance, including data owners, data stewards, and data analysts.
- Training and Development: Develop and implement training programs to equip existing staff with necessary data management skills or consider hiring new talent with the required expertise.
Data governance is a multidimensional approach that encompasses organizational structure, management systems, and the people involved in managing data. For successful transformation, enterprises must ensure they have the appropriate structures and resources in place, which may involve acquiring new skills or training existing staff. This holistic approach to data governance will empower the enterprise to leverage its data assets effectively and maintain data integrity and compliance throughout the transformation process.
1.5: Architecture Repository Considerations
The Architecture Repository serves as a central resource for managing architectural artifacts, including Data Architecture components. Within this context, the architecture team should focus on the following aspects:
1. Identifying Relevant Data Architecture Resources
Generic Data Models: Review existing generic data models that are applicable to the organization's industry vertical. These models can provide a foundational framework that can be customized to meet specific business needs.
Enterprise Data Models: Look for existing enterprise-wide data models that capture the overall data landscape, including key data entities, relationships, and data flows within the organization.
Reference Models: Identify reference models that may outline industry best practices or standards, which can guide the development and alignment of the Data Architecture with industry norms.
2. Utilizing the Architecture Repository
Centralized Documentation: Ensure that all data-related artifacts, such as data catalogs, matrices, and diagrams, are documented and stored within the Architecture Repository for easy access and management.
Version Control: Implement version control mechanisms to track changes and updates to data architecture artifacts, ensuring that all stakeholders are working with the latest information.
Interoperability: Establish links between the Data Architecture resources and other architectural domains (e.g., Business Architecture, Application Architecture) to facilitate a holistic understanding of how data integrates with other components.
3. Integration with Organizational Knowledge
Best Practices: Leverage best practices from previous projects documented in the repository to inform the development of the Data Architecture.
Lessons Learned: Review lessons learned from past architectural efforts, which can provide insights into potential pitfalls and successful strategies for data management and governance.
By thoroughly exploring and leveraging the resources available in the Architecture Repository, the architecture team can enhance the development and implementation of a robust Data Architecture. This approach not only helps in aligning with industry standards but also fosters effective data management and governance practices within the organization.
2: Steps
1: Selecting Reference Models, Viewpoints, and Tools
Review Data Principles
- Validate or generate a set of data principles that align with overarching Architecture Principles.
- Use TOGAF Standard ADM Techniques for guidelines on principles and sample data principles.
Select Data Architecture Resources
- Choose relevant Data Architecture resources such as reference models and patterns based on business drivers, stakeholders, concerns, and Business Architecture.
Select Data Architecture Viewpoints
- Identify relevant viewpoints to address stakeholder concerns, such as regulatory bodies, users, time dimensions (real-time, event-driven, reporting period), and business processes.
Identify Tools and Techniques
- Choose appropriate tools for data capture, modeling, and analysis. Tools may range from simple documents or spreadsheets to more advanced data management and modeling techniques.
Examples of Data Modeling Techniques
- Entity relationship diagram
- Class diagram
Further Guidance
- TOGAF® Series Guide: Information Architecture — Customer Master Data Management
- The Open Group Guide: Information Architecture: Business Intelligence & Analytics and Metadata Management Reference Models
1.1: Determining the Overall Modeling Process
Select Models for Each Viewpoint
- For each identified viewpoint, choose the appropriate models needed to support that specific view using the selected tools or methods.
Address Stakeholder Concerns
- Ensure that all stakeholder concerns are addressed. If any concerns are not covered, either create new models or enhance existing ones to fill those gaps.
Recommended Process for Developing Data Architecture
1. Collect Data-Related Models
- Gather data-related models from existing Business Architecture and Application Architecture materials.
2. Rationalize Data Requirements
- Align data requirements with existing enterprise data catalogs and models to develop a comprehensive data inventory and entity relationship model.
3. Update and Develop Matrices
- Create and update matrices across the architecture to relate data to business services, capabilities, functions, access rights, and applications.
4. Elaborate Data Architecture Views
- Examine how data is created, distributed, migrated, secured, and archived to provide detailed Data Architecture views.
This table outlines the steps involved in determining the overall modeling process for Data Architecture, emphasizing the importance of stakeholder engagement and thorough data management practices.
1.2: Identifying Required Catalogs of Data Building Blocks
Data Catalog Overview
- Capture descriptions of data in a catalog that shows the decomposition of related model entities (e.g., from data entity to logical data component to physical data component).
Prerequisite Diagram
- During the Business Architecture phase, a Business Service/Information diagram was created, illustrating the key data entities required by the main business services. This serves as a prerequisite for effective Data Architecture activities.
Traceability for Data Inventory
- Utilize traceability from business function/business capability to application and data entity to create an inventory of the data necessary to support the Architecture Vision.
Consolidate Data Requirements
- Once data requirements are consolidated in one location, refine the data inventory to achieve semantic consistency and eliminate gaps and overlaps.
Reference TOGAF Standard
- The TOGAF Standard — Architecture Content provides a detailed description of catalogs to consider for development within Data Architecture, relating them to entities, attributes, and relationships in the TOGAF Enterprise Metamodel.
This table outlines the process of identifying and cataloging data building blocks, emphasizing the importance of traceability, consolidation, and reference to established standards.
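The traceability idea above (business function to application to data entity) can be sketched as a simple join over two mappings. The function, application, and entity names are hypothetical, and the two dictionaries stand in for whatever catalogs the Business and Application Architecture phases actually produced.

```python
# Hypothetical traceability links from earlier architecture phases
function_to_apps = {
    "Order Handling": ["Webshop", "ERP"],
    "Invoicing": ["Billing"],
}
app_to_entities = {
    "Webshop": ["Customer", "Order"],
    "ERP": ["Order", "Product"],
    "Billing": ["Customer", "Invoice"],
}


def data_inventory(function_to_apps, app_to_entities):
    """Map each data entity to the business functions that depend on it."""
    inventory = {}
    for function, apps in function_to_apps.items():
        for app in apps:
            for entity in app_to_entities.get(app, []):
                inventory.setdefault(entity, set()).add(function)
    # Sort for a stable, readable inventory
    return {
        entity: sorted(functions)
        for entity, functions in sorted(inventory.items())
    }


inventory = data_inventory(function_to_apps, app_to_entities)
```

Collecting every entity in one structure is the consolidation step; entities that appear under several functions (here `Customer`) are the first candidates to review for semantic consistency.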
1.3: Identifying Required Matrices
Entity to Applications Matrix
- At this stage, create an entity to applications matrix to validate the mapping of data. This matrix will help in understanding how data is created, maintained, transformed, and utilized by various applications.
Gap Identification
- Note gaps such as entities that are never created by any application or data that is created but never utilized. These gaps should be documented for later gap analysis.
Update Architectural Diagrams
- Utilize the rationalized data inventory to update and refine architectural diagrams that illustrate how data relates to other components of the architecture.
Short Iteration of Application Architecture
- After updates are made, consider conducting a brief iteration of the Application Architecture to address the identified changes.
Reference TOGAF Standard
- The TOGAF Standard — Architecture Content provides detailed descriptions of matrices that should be developed within Data Architecture, linking them to entities, attributes, and relationships in the TOGAF Enterprise Metamodel.
This table summarizes the steps involved in identifying and creating necessary matrices in Data Architecture, emphasizing the importance of validating mappings, identifying gaps, and updating architectural diagrams.
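An entity to applications matrix and the two gap types mentioned above can be sketched as follows. The CRUD letter encoding, the sample entities and applications, and the choice to treat "never utilized" as "never read" are assumptions made for illustration.

```python
# Hypothetical matrix: entity -> {application: CRUD letters}
matrix = {
    "Customer": {"CRM": "CRUD", "Billing": "R"},
    "Order": {"Webshop": "CU"},
    "Price List": {"Billing": "R"},
}


def find_gaps(matrix):
    """Flag entities no application creates and entities created but never read."""
    never_created, never_used = [], []
    for entity, usage in matrix.items():
        operations = set("".join(usage.values()))
        if "C" not in operations:
            never_created.append(entity)
        elif "R" not in operations:
            # Simplification: "utilized" is approximated by a read operation.
            never_used.append(entity)
    return never_created, never_used


never_created, never_used = find_gaps(matrix)
```

Both result lists are inputs to the later gap analysis rather than errors in themselves; an entity with no creator may, for example, be supplied by an external party that the matrix does not yet show.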
1.4: Identifying Required Diagrams
Purpose of Diagrams
- Diagrams present Data Architecture information from various perspectives (viewpoints) based on stakeholder requirements.
Relationship Diagram
- After refining data entities, create a diagram illustrating the relationships between entities and their attributes.
Data Sources
- Recognize that the information may include a mix of enterprise-level data (from system service providers and vendor information) and local-level data (held in personal databases and spreadsheets).
Assessment of Detail Level
- Carefully assess the level of detail in modeling; some physical system data models may include very detailed levels, while others may only model core entities.
Updating Data Models
- Be aware that not all data models will have been kept up to date as applications were modified and extended, so existing models may no longer match what is actually implemented.
Balancing Detail
- Aim for a balance between reproducing existing detailed system physical data schemas and presenting high-level process maps with data requirements.
Reference TOGAF Standard
- The TOGAF Standard — Architecture Content offers a detailed description of diagrams to consider for development within Data Architecture, relating them to entities, attributes, and relationships in the TOGAF Enterprise Metamodel.
This table outlines the process of identifying required diagrams in Data Architecture, emphasizing the need for various perspectives, careful assessment of detail, and balance in data modeling.
1.5: Identifying Types of Requirements to be Collected
Formalizing Requirements
- After developing Data Architecture catalogs, matrices, and diagrams, the next step is to formalize the data-focused requirements for implementing the Target Architecture.
Types of Requirements
- The following types of requirements should be identified:
1. Data Domain Requirements
- Requirements that specifically relate to the data domain, detailing what data is needed and how it should be managed.
2. Application Architecture Input
- Requirements that provide essential input into the Application Architecture, ensuring alignment with the overall architecture goals.
3. Technology Architecture Input
- Requirements that inform the Technology Architecture, guiding technology decisions and integrations.
4. Design and Implementation Guidance
- Detailed guidance to be reflected during design and implementation phases, ensuring that the solution effectively addresses the original architecture requirements.
Role of the Architect
- The architect is responsible for identifying and ensuring that all relevant requirements are met by the architecture.
This table summarizes the process of identifying and formalizing requirements necessary for implementing the Target Architecture within Data Architecture.
2: Developing Baseline Data Architecture Description
Purpose of Baseline Description
- Create a Baseline Description of the existing Data Architecture to support the development of the Target Data Architecture.
Scope and Level of Detail
- The scope and detail will depend on the extent of existing data elements to be retained in the Target Data Architecture and the availability of existing architectural descriptions.
Identification of Building Blocks
- Identify relevant Data Architecture building blocks, utilizing the Architecture Repository as a resource (refer to the TOGAF Standard — Architecture Content).
Developing New Models
- If new architecture models are necessary to address stakeholder concerns, use the models identified in previous steps as a guideline for creating the new architecture content that describes the Baseline Architecture.
This table outlines the key steps and considerations for developing a Baseline Data Architecture Description, emphasizing the importance of identifying relevant building blocks and addressing stakeholder concerns through appropriate models.
3: Developing Target Data Architecture Description
Purpose of Target Description
- Create a Target Description for the Data Architecture to support the Architecture Vision and Target Business Architecture.
Scope and Level of Detail
- The scope and detail will depend on how relevant the data elements are for achieving the Target Architecture and the availability of existing architectural descriptions.
Identification of Building Blocks
- Identify relevant Data Architecture building blocks, using the Architecture Repository as a resource (refer to the TOGAF Standard — Architecture Content).
Developing New Models
- If new architecture models are required to address stakeholder concerns, use models identified in previous steps as a guideline for creating the new architecture content that describes the Target Architecture.
Exploration of Alternatives
- Investigate different Target Architecture alternatives and discuss these with stakeholders using the Architecture Alternatives and Trade-offs technique (see the TOGAF Standard — ADM Techniques).
This table summarizes the key steps and considerations for developing a Target Data Architecture Description, focusing on the identification of relevant building blocks and engaging stakeholders in discussions about architectural alternatives.
4: Performing Gap Analysis
Purpose
- Verify the architecture models for internal consistency and accuracy.
Steps
- Trade-off Analysis: Perform trade-off analysis to resolve any conflicts among different views.
- Model Validation: Ensure that the models align with the established principles, objectives, and constraints.
- Document Changes: Note any changes to the viewpoints represented in the selected models from the Architecture Repository and document them.
- Completeness Testing: Test the architecture models for completeness against requirements.
Gap Identification
- Identify gaps between the Baseline and Target Architectures using the gap analysis technique as outlined in the TOGAF Standard — ADM Techniques.
This table outlines the essential activities involved in performing a gap analysis, focusing on verifying architecture models and identifying discrepancies between baseline and target states.
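The core of a baseline-to-target comparison can be sketched with sets. This is a simplified illustration and not the full gap analysis technique: the building block names and the `carried_forward` mapping (which baseline block is retained or replaced by which target block) are hypothetical inputs that an architect would supply.

```python
# Hypothetical building blocks
baseline = {"Customer DB", "Order Files", "Product Catalog"}
target = {"Customer Master", "Product Catalog", "Order Service Store"}

# Hypothetical mapping: baseline block -> target block that retains or replaces it
carried_forward = {
    "Customer DB": "Customer Master",
    "Product Catalog": "Product Catalog",
}


def gap_analysis(baseline, target, carried_forward):
    """Return baseline blocks with no target counterpart and target blocks that are new."""
    eliminated = sorted(b for b in baseline if b not in carried_forward)
    new = sorted(t for t in target if t not in carried_forward.values())
    return eliminated, new


eliminated, new = gap_analysis(baseline, target, carried_forward)
```

Each eliminated block should be confirmed as intentionally dropped or recorded as an accidental omission, and each new block becomes a gap to be filled by development or procurement.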
5: Defining Candidate Roadmap Components
Purpose
- To prioritize activities for the upcoming phases based on the Baseline Architecture, Target Architecture, and gap analysis.
Outcome
- Creation of an initial Data Architecture roadmap.
Next Steps
- Use the initial roadmap as raw material for developing a more detailed, consolidated, cross-discipline roadmap during the Opportunities & Solutions phase.
This table summarizes the process of defining candidate roadmap components for the Data Architecture, emphasizing the importance of prioritization and future integration into a broader roadmap.
6: Resolving Impacts Across the Architecture Landscape
Purpose
- To assess the wider impacts and implications of the finalized Data Architecture on the overall architecture landscape.
Key Considerations
- Examine other architecture artifacts to identify potential impacts and opportunities.
Questions to Address
1. Does this Data Architecture create an impact on any pre-existing architectures?
2. Have recent changes been made that impact the Data Architecture?
3. Are there opportunities to leverage work from this Data Architecture in other areas of the organization?
4. Does this Data Architecture impact other projects (including planned and in-progress projects)?
5. Will this Data Architecture be impacted by other projects (including planned and in-progress projects)?
Outcome
- A comprehensive understanding of the implications of the Data Architecture, allowing for better decision-making and alignment with organizational goals.
This table outlines the process for resolving impacts across the architecture landscape, highlighting essential questions and considerations to ensure a holistic view of the Data Architecture's effects.
7: Conduct Formal Stakeholder Review
Purpose
- To validate the proposed Data Architecture against original project motivations and the Statement of Architecture Work.
Key Activities
1. Check Original Motivation: Review the architecture project's original goals and objectives.
2. Impact Analysis: Analyze the proposed Data Architecture for potential impacts on Business and Application Architectures. Identify changes needed in business practices, forms, procedures, applications, or database systems.
3. Revisit Architectures: If significant impacts are identified, consider revisiting the Business and Application Architectures.
4. Application Architecture Adjustments: Identify necessary changes in the Application Architecture to accommodate the new Data Architecture or to impose constraints on the design. If significant, initiate a short iteration of the Application Architecture.
5. Technology Architecture Constraints: Identify any constraints that the proposed Data Architecture imposes on the Technology Architecture being designed.
Outcome
- A comprehensive understanding of the impacts of the proposed Data Architecture, leading to informed adjustments in related architectures as necessary.
This table summarizes the steps involved in conducting a formal stakeholder review of the proposed Data Architecture, ensuring alignment with project objectives and identifying necessary adjustments across associated architectures.
8: Finalize the Data Architecture
Purpose
- To complete and formalize the Data Architecture, ensuring it meets all requirements and standards.
Key Activities
1. Select Standards: Choose appropriate standards for each building block, reusing elements from the selected reference models in the Architecture Repository.
2. Document Building Blocks: Fully document each building block, including specifications, functionalities, and interrelations.
3. Cross-Check Architecture: Conduct a final review of the overall architecture against business requirements, ensuring alignment. Document the rationale for decisions regarding building blocks in the architecture documentation.
4. Requirements Traceability Report: Prepare and document the final requirements traceability report to track how each requirement is addressed by the architecture.
5. Final Mapping: Document the final mapping of the architecture within the Architecture Repository. Identify and publish any building blocks that may be reused in future projects.
6. Finalize Work Products: Complete and finalize all relevant work products, including the gap analysis, ensuring they are updated to reflect the final architecture.
Outcome
- A finalized Data Architecture that is well-documented, aligned with business requirements, and ready for implementation, with all relevant work products completed and published.
This table outlines the essential steps required to finalize the Data Architecture, emphasizing documentation, standardization, and alignment with business objectives.
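A requirements traceability report can be reduced to a mapping from each requirement to the building blocks that address it. The sketch below is one possible shape for such a report; the requirement identifiers, descriptions, and building block names are hypothetical.

```python
# Hypothetical requirements
requirements = {
    "REQ-1": "Single customer view",
    "REQ-2": "Retain invoices for seven years",
    "REQ-3": "Share product data with suppliers",
}

# Hypothetical mapping: building block -> requirements it addresses
building_blocks = {
    "Customer Master": ["REQ-1"],
    "Invoice Archive": ["REQ-2"],
    "Customer API": ["REQ-1"],
}


def traceability_report(requirements, building_blocks):
    """Return requirement -> addressing blocks, plus requirements nobody addresses."""
    report = {req_id: [] for req_id in requirements}
    for block, req_ids in building_blocks.items():
        for req_id in req_ids:
            report[req_id].append(block)
    unaddressed = [req_id for req_id, blocks in report.items() if not blocks]
    return report, unaddressed


report, unaddressed = traceability_report(requirements, building_blocks)
```

An empty `unaddressed` list is the condition to aim for before finalizing; here `REQ-3` would need either a building block or a documented decision to defer it.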
9: Create/Update the Architecture Definition Document
Purpose
- To compile and update the Architecture Definition Document with detailed Data Architecture sections and the rationale behind building block decisions.
Key Activities
1. Document Rationale: Provide clear reasoning for decisions made regarding building blocks, ensuring alignment with architectural goals and stakeholder needs.
2. Prepare Data Architecture Sections: Develop the following sections for the Architecture Definition Document:
- Business Data Model: Outline the high-level representation of data relevant to the business processes.
- Logical Data Model: Create a detailed logical representation of data structures and their relationships.
- Data Management Process Model: Document processes for data governance, quality, and lifecycle management.
- Data Entity/Business Function Matrix: Construct a matrix mapping data entities to business functions to illustrate relationships.
- Data Interoperability Requirements: Specify requirements for data exchange, including formats (e.g., XML schema) and security policies.
3. Utilize Reports/Graphics: If relevant, incorporate visual representations generated by modeling tools to illustrate key views of the architecture.
4. Review and Feedback: Route the completed document for review by relevant stakeholders and incorporate their feedback to ensure accuracy and completeness.
Outcome
- An updated Architecture Definition Document with comprehensive Data Architecture sections that reflect building block decisions, stakeholder feedback, and critical architectural views.
This table outlines the steps needed to create or update the Architecture Definition Document, ensuring it captures all essential elements of the Data Architecture and the rationale for decisions made during its development.