[Config] refactoring to track set config values and unify config object with metadata
## Problem
The current configuration system has a tight coupling between `environment_config` and `config` that creates several problems:
1. **Lost context:** Once environment variables are parsed into the `config` object, we lose the connection to which environment variable was set by the user and what the value was. This makes it difficult to provide helpful error messages that reference specific configuration variables.
2. **Duplication:** We maintain two parallel structures - `environment_config` (with ENV values) and `config` (with typed values) - leading to redundant definitions and code. This makes a start on implementing deduplication.
3. **Difficult to extend:** Creating new, smaller config objects (like a bootstrap config for scan summary observer) requires either copying large amounts of tightly-coupled parsing code or duplicating the existing pattern.
4. **No traceability:** When errors occur during validation or at runtime (e.g., timeouts), we cannot easily trace back to the environment variable name to tell users which configuration to adjust.
5. **Complex parsing code:** The translation from `environment_config` to `config` involves complex intermediate code that's all strongly coupled to these two specific objects.
### Current architecture
```
Environment Variables
↓
environment_config (ENV values)
↓ (translation)
config (typed values)
↓ (context lost)
Usage (no reference to original env variable set)
```
## Proposed Solution
Replace the current split between `environment_config` and `config` with a unified configuration approach using a `ConfigValue` wrapper type that carries metadata alongside values.
### Architecture
```go
type ConfigValue[T any] struct {
Value T // The parsed, typed value
EnvName string // The environment variable name used (may be alias)
CanonicalName string // The documented/canonical variable name
OriginalValue string // The raw string value before parsing
}
```
### Single config object approach
Instead of two separate objects, have one config object where:
- Fields are defined as `ConfigValue[Type]` (e.g., `ConfigValue[time.Duration]`)
- Parsing happens inline with the field definitions
- Environment variable names are captured during parsing
- No translation phase needed between two separate structures
### Implementation Details to follow
**Current (two objects):**
**Proposed (single object):**
```go
type Config struct {
Username ConfigValue[string] `env:"DAST_AUTH_USERNAME"`
}
// Access value: config.Username.Value
// Access metadata: config.Username.EnvName, config.Username.CanonicalName
```
## Benefits
### Immediate benefits
1. **Better error messages:** Can reference specific environment variable names in all error messages, not just validation errors
```
Error: connection timeout exceeded
To adjust, set DAST_TARGET_CHECK_TIMEOUT (currently: 60s)
```
2. **Single source of truth:** One config object instead of two parallel structures
3. **Easier to create smaller configs:** Can create focused config objects for specific components without copying complex parsing machinery
4. **Runtime context:** Can reference configuration metadata at any point during execution, not just during parsing
### Long-term benefits
1. **Alias handling:** Can show canonical name in documentation links even when user used an alias
2. **Better help text:** Can programmatically generate help documentation from config struct
3. **Audit trail:** Can log which configuration values were explicitly set vs defaulted
4. **Debugging:** Can trace configuration decisions back to source
## Technical Approach (🚨 Accurate details will be in each individual issue)
### Initially: Prototype with new bootstrap config on subset of config
Start by implementing this approach for a new, minimal bootstrap configuration needed for scan summary observer and logger initialization:
1. Create `ConfigValue[T]` wrapper type (structure already POCed by @DavidNelsonGL in [this](https://gitlab.com/gitlab-org/security-products/analyzers/browserker/-/tree/config-value?ref_type=heads) branch)
2. Add field to `EnvValue` to pull in `envName`
3. Define bootstrap config struct with ConfigValue fields
4. Implement parsing logic that captures metadata
5. Test the approach in limited scope
**Example bootstrap config:**
```go
type BootstrapConfig struct {
TargetURL ConfigValue[string] `env_name:"DAST_TARGET_URL"`
}
```
**Draft acceptance criteria for MVC:**
- [ ] `ConfigValue[T]` type defined with value and metadata fields (initially just the current name not the alias)
- [ ] `EnvValue` enhanced to track which env var name was used
- [ ] Bootstrap config uses `ConfigValue` wrapper for all fields
- [ ] Parsing correctly populates both value and metadata
- [ ] Can access both `config.Field.Value()` and `config.Field.EnvName()`
- [ ] Error messages reference specific environment variable names
- [ ] No dependencies on existing `environment_config` or `config` objects
- [ ] `.Value()` accessor provides clean access to typed values
#### Notes
- Details may fluctuate as part of the POC, and I will update the comments and description with evolving details.
CC: @gitlab-org/secure/dynamic-analysis
<!-- start-discoto-summary -->
## Auto-Summary :robot:
<details>
<summary>Discoto Usage</summary>
---
> **Points**
>
> Discussion points are declared by headings, list items, and single
> lines that start with the text (case-insensitive) `point:`. For
> example, the following are all valid points:
>
> * `#### POINT: This is a point`
> * `* point: This is a point`
> * `+ Point: This is a point`
> * `- pOINT: This is a point`
> * `point: This is a **point**`
>
> Note that any markdown used in the point text will also be propagated
> into the topic summaries.
>
> **Topics**
>
> Topics can be stand-alone and contained within an issuable (epic,
> issue, MR), or can be inline.
>
> Inline topics are defined by creating a new thread (discussion)
> where the first line of the first comment is a heading that starts
> with (case-insensitive) `topic:`. For example, the following are all
> valid topics:
>
> * `# Topic: Inline discussion topic 1`
> * `## TOPIC: **{+A Green, bolded topic+}**`
> * `### tOpIc: Another topic`
>
> **Quick Actions**
>
> | Action | Description |
> |-------------------------------|---------------------------------------------------------|
> | `/discuss sub-topic TITLE` | Create an issue for a sub-topic. Does not work in epics |
> | `/discuss link ISSUABLE-LINK` | Link an issuable as a child of this discussion |
>
---
</details>
Last updated by [this job](https://gitlab.com/theoretick/discoto-runner/-/jobs/12117722609)
* **TOPIC** Part A - What Data Should `ConfigValue[T]` Contain? :thread: https://gitlab.com/groups/gitlab-org/-/issues/19992#note_2893453361
* **TOPIC** Part B - How to Implement the Migration :thread: https://gitlab.com/groups/gitlab-org/-/issues/19992#note_2894048115
<!-- end-discoto-summary -->
<!-- start-discoto-topic-settings --><details>
<summary>Discoto Settings</summary>
<br/>
```yaml
---
summary:
max_items: -1
sort_by: created
sort_direction: ascending
```
See the [settings schema](https://gitlab.com/gitlab-org/secure/pocs/discoto#settings-schema) for details.
</details>
<!-- end-discoto-topic-settings -->
epic