
Configuration Reshape
Introduction
Tinyclues operates on a multi-tenant software architecture where clients share the same development environment and cloud instance. To accommodate varying client needs, configurations are specified in YAML files injected into the technical stack, providing client-specific settings.
Examples of Configurations:
AI Models: Different data volumes and features require specific parameters for model training.
International Currencies: Configurations determine which currency (EUR, USD, GBP, etc.) is displayed on the front-end.
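For illustration, a per-client file along these lines could carry both kinds of settings. This is a minimal sketch; the tenant name, keys, and values are hypothetical, not the actual Tinyclues schema:

import yaml

# Hypothetical per-client configuration (illustrative keys only).
raw_config = """
tenant: acme_retail
front_end:
  currency: EUR            # currency displayed on the front-end
training:
  epochs: 20               # tuned to the client's data volume
  learning_rate: 0.001
"""

config = yaml.safe_load(raw_config)
print(config["front_end"]["currency"])  # -> EUR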
Multiple technical teams (front-end, data engineering, data operations, data science, product) may create and manage these configurations. However, as new features and clients are added, the number of configuration files increases, making it challenging to maintain and debug these settings.
Problem Statement
To reduce technical debt and enhance the reliability of configuration files across technical stacks, the following measures should be implemented:
Standardized Format: Define and test configurations using a consistent format and set of rules for all clients/tenants.
Single Source of Truth: Centralize configuration access to ensure all teams use the latest version and avoid outdated or conflicting versions. This also prevents race conditions from simultaneous reads/writes by different stacks.
Improved Issue Detection: Develop better mechanisms to identify and debug configuration-related issues, which may affect individual tenants, multiple tenants, or all tenants within the software architecture.
Implementation
To reduce technical debt, enhance configuration management, and establish a single source of truth, a reshaping of the configurations was essential.
This included the following steps:
1. Define Configuration Files and Configuration Keys/Settings
Identified Key Configuration Files:
domain settings: Includes all critical product and business settings for each client.
training config: Contains parameters related to data science model training (e.g., number of epochs, model parameters).
evaluation config: Covers configurations for testing data science models before production deployment.
cleaned data config: Manages data cleaning processes specific to each client.
predictive schema: Details the data used for training, including columns, data duration, features, and model targets (see the sketch after this list).
insights and analytics: Specifies client-specific settings for insights and analytics features in the product.
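As an example, a predictive schema could look like the sketch below; the field names are hypothetical, while the real file describes the columns, data duration, features, and model targets in more detail:

import yaml

predictive_schema = yaml.safe_load("""
columns: [user_id, event_date, product_id, amount]   # data used for training
history_days: 365                                    # data duration
features: [recency, frequency, monetary]
targets: [purchase_propensity]                       # model targets
""")
print(predictive_schema["targets"])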
These configuration files are stored in a dedicated GitHub repository and must adhere to specific rules and constraints. Each time a configuration is modified, the CI pipeline executes tests to verify compliance with these rules before changes are accepted.
If the tests pass, the changes can be merged into the master/main branch.
This process ensures that engineers produce high-quality configuration files that are compatible with the technical stack.
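A minimal sketch of one such CI check, assuming jsonschema-style rules; the schema, directory layout, and file names below are illustrative:

from pathlib import Path
import yaml
from jsonschema import validate  # raises ValidationError on non-compliant files

# Illustrative rule set: every domain settings file must name its tenant
# and declare a supported front-end currency.
SCHEMA = {
    "type": "object",
    "required": ["tenant", "front_end"],
    "properties": {
        "tenant": {"type": "string"},
        "front_end": {
            "type": "object",
            "required": ["currency"],
            "properties": {"currency": {"enum": ["EUR", "USD", "GBP"]}},
        },
    },
}

def test_domain_settings_comply():
    # Assumed layout: configs/<tenant>/domain_settings.yaml
    for path in Path("configs").glob("*/domain_settings.yaml"):
        validate(instance=yaml.safe_load(path.read_text()), schema=SCHEMA)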
2. Testing Configuration Files for Format, Missing Keys, and Validation
Not all technical stacks require every configuration file or key. However, maintaining all configurations in a single location—a single source of truth—simplifies access to any necessary keys.
Implementation Steps:
Centralized Access: Keep all configuration files in one repository (a single source of truth) to facilitate access and management.
Flexible Usage: Allow technical stacks to select and use the specific configuration keys and settings they need, while having access to all available configurations. For example, ML pipelines read the training config to train data science models, but could also access the domain settings later if needed.
Merge Script: A Python script merges all configuration files into a single file per client/tenant (a sketch of this step follows the list).
CI Integration: On each merge into the configuration repository, the CI pipeline runs the merge script for all tenants.
Deployment: The new merged configurations are pushed to a dedicated GCS bucket, replacing the previous versions.
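A minimal sketch of such a merge script, assuming one directory per tenant and one YAML file per configuration type (paths and names are assumptions):

import json
from pathlib import Path
import yaml

CONFIG_ROOT = Path("configs")   # assumed layout: configs/<tenant>/<config>.yaml
OUTPUT_ROOT = Path("merged")

def merge_tenant(tenant_dir: Path) -> dict:
    """Combine every configuration file of one tenant into a single dict."""
    merged = {}
    for yaml_file in sorted(tenant_dir.glob("*.yaml")):
        # Each file (domain settings, training config, ...) becomes a top-level key.
        merged[yaml_file.stem] = yaml.safe_load(yaml_file.read_text())
    return merged

OUTPUT_ROOT.mkdir(exist_ok=True)
for tenant_dir in sorted(CONFIG_ROOT.iterdir()):
    if tenant_dir.is_dir():
        # One config.json per tenant; CI later pushes these to the GCS bucket.
        out_path = OUTPUT_ROOT / tenant_dir.name / "config.json"
        out_path.parent.mkdir(parents=True, exist_ok=True)
        out_path.write_text(json.dumps(merge_tenant(tenant_dir), indent=2))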
3. Automatic Configuration Updates for All Tenants via CI Pipeline
Centralized Access: All components of the technical stack and tech teams access the configuration files from the dedicated GCS bucket.
Dynamic Retrieval: Front-end applications, data pipelines, and data science notebooks can retrieve the latest configuration files from this GCS bucket as needed (a retrieval sketch follows this list).
Simplified Identity and Access Management: A dedicated service account ensures secure access to the configuration files, granting permissions only to trusted users.
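For example, a pipeline or notebook might fetch its tenant's merged file like this (a sketch using the google-cloud-storage client; the bucket name and object path are assumptions):

import json
from google.cloud import storage

def load_tenant_config(tenant: str) -> dict:
    # Credentials of the dedicated service account are resolved from the environment.
    client = storage.Client()
    blob = client.bucket("tinyclues-tenant-configs").blob(f"{tenant}/config.json")
    return json.loads(blob.download_as_text())

config = load_tenant_config("acme_retail")
print(config["domain_settings"])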
Results
Standardized Configuration Files: Configuration files are now standardized with a specific structure and rules.
Single Source of Truth: Merging all configurations into a single file per tenant provides a unified source of truth accessible to all tech teams and technical components.
Secure and Robust Access: Storing configurations in a dedicated GCS bucket ensures secure, robust access, resolving issues like race conditions and guaranteeing that the latest version is available to all applications.
Enhanced Issue Diagnosis: Identifying and diagnosing configuration-related issues has become easier (a small triage sketch follows this list):
If the config.json is updated in the GCS bucket, the CI pipeline is healthy.
If the config.json is as expected, the problem lies within the technical component using it. The relevant team (front-end, data science, data engineering) needs to add features or fix bugs to align with the latest version of the config file.
If the config.json does not meet expectations but the tests pass, the data engineering team adds new tests or updates the test scenarios to catch the issue, fixes the configuration file in the repository, and ensures that the CI pipeline tests the updated configuration and pushes the correct config.json to the GCS bucket.
If the config.json is not updated in the GCS bucket, the CI pipeline is broken. The data engineer can fix the tests, or the operations engineer can rewrite the configuration file so that the correct config.json is pushed to the GCS bucket.
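A small triage helper along these lines can automate the first check, using the object's last-update timestamp as a proxy for CI health (the bucket name and freshness threshold are assumptions):

from datetime import datetime, timedelta, timezone
from google.cloud import storage

def config_recently_updated(tenant: str, max_age_hours: int = 24) -> bool:
    """Heuristic: True if the tenant's config.json was refreshed recently."""
    client = storage.Client()
    blob = client.bucket("tinyclues-tenant-configs").blob(f"{tenant}/config.json")
    blob.reload()  # fetch metadata, including the last-update timestamp
    age = datetime.now(timezone.utc) - blob.updated
    return age < timedelta(hours=max_age_hours)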