
docs: add Kubernetes troubleshooting guide #393

Merged
markphelps merged 5 commits into main from docs/k8s-troubleshooting
Feb 26, 2026



markphelps commented Feb 25, 2026

Summary

Adds a comprehensive Kubernetes troubleshooting guide based on real deployment issues encountered by users deploying Flipt v2 to GKE and other containerd-based clusters.

Sections

1. Helm Schema Validation Errors

--skip-schema-validation workaround for overly strict JSON schema.

2. Machine ID Not Found (License Validation)

Explains why machineid: no machine-id found occurs on containerd clusters and provides the hostPath volume mount fix. Lists affected environments (GKE, EKS, Fargate, Cloud Run).

3. Persistent Volume Configuration

Covers enabling persistence, matching storage backend paths to PVC mount points, and the common mistake of using relative paths.

4. Secrets Management with GitOps

Full walkthrough of the envFrom + ${env:VAR_NAME} pattern for keeping secrets out of values.yaml. Covers External Secrets Operator, Sealed Secrets, ArgoCD Vault Plugin, and manual kubectl approaches.

Navigation

Added to the Deployment group in the v2 sidebar, directly after "Deploy to Kubernetes".

Related Issues

Cover common issues encountered deploying Flipt v2 to Kubernetes:
- Helm schema validation errors (--skip-schema-validation)
- Machine ID not found on containerd clusters (GKE, EKS)
- Persistent volume configuration and storage backend paths
- Secrets management with GitOps (ArgoCD, FluxCD)
- flipt license check vs Helm values confusion

Based on real deployment issues from community support.

claude bot commented Feb 25, 2026

PR Review: Kubernetes Troubleshooting Guide

This is a well-structured, practical guide that addresses real user pain points. The content is clearly grounded in actual deployment issues, and the step-by-step solutions in the Secrets Management section are excellent. A few issues to address before merging:


Required: Missing language tags on code blocks

Per the project writing guidelines, all code blocks need language tags. Two blocks are missing them:

Line 12 — the Helm error output block has no language tag; it should use text.

Line 45 — the license error output block has no language tag; it should also use text.


Required: Section heading casing (line 215)

The heading ## flipt license check and Helm Values starts with lowercase, inconsistent with every other heading in the document. Since flipt license check is a command name, wrapping it in backticks fixes both issues.


Recommended: Cloud Run is not Kubernetes (line 95)

The "Affected Environments" list includes "Serverless container runtimes (AWS Fargate, Google Cloud Run)". Google Cloud Run is not Kubernetes and does not use the Helm chart — Cloud Run users would never encounter this issue via this path. AWS Fargate is only relevant through Amazon EKS, which is already listed. This item may confuse readers.

Suggested fix: remove "Google Cloud Run" and clarify Fargate is EKS-specific, e.g. "Amazon EKS with AWS Fargate node groups".


Recommended: The schema Note will go stale (lines 36–38)

"The Helm chart schema is being updated to cover all valid configurations." — once the schema is fixed upstream, this sentence becomes outdated. Consider more evergreen phrasing, or link to the helm-charts issue so users can track progress themselves.


Recommended: Verify envFrom and extraEnvVars are in the current chart

The PR description references helm-charts#269 and helm-charts#270 as related but still-open issues. Please confirm that flipt.envFrom and flipt.extraEnvVars are supported in the currently released Helm chart. If not yet released, note the minimum chart version required, or hold those sections until the chart ships.


Minor: --skip-schema-validation requires Helm 3.13+

This flag was introduced in Helm 3.13. Users on older releases will get an "unknown flag" error. A brief note on the minimum required version would help.


Overall

The content quality is high and this will be genuinely useful for users hitting these issues. The missing language tags and heading casing are the only hard blockers per the project writing guidelines. Everything else is recommended for accuracy and longevity.

Apply suggested changes from PR review: add missing language tags on
code blocks, fix heading casing, remove Cloud Run from affected
environments, make schema note evergreen, add Helm version requirement.
Also link to the troubleshooting guide from the deploy guide and
installation page.

Signed-off-by: Mark Phelps <209477+markphelps@users.noreply.github.com>

claude bot commented Feb 25, 2026

Review

This is a well-structured guide addressing real deployment pain points. The content is practical, the Mintlify components (Note, Warning, Tip) are used appropriately, and all code blocks have language tags. Navigation integration and cross-referencing are correct. A few things worth addressing before publishing:


Issues

1. Helm values path inconsistency

The volume mount snippet places keys at the top level of values.yaml:

extraVolumeMounts:
  - name: machine-id
    ...
extraVolumes:
  - name: machine-id
    ...

But the secrets section nests its keys under flipt:

flipt:
  envFrom:
    - secretRef: ...
  extraEnvVars:
    - name: FLIPT_LICENSE_KEY
      ...

If both are intended for the same values.yaml, this inconsistency will confuse readers. Please verify the correct nesting for each key against the Helm chart's actual values.yaml schema and make the examples consistent.


2. kubectl exec deployment name is implicit

The command uses deploy/flipt-v2, which is the chart's default release name, but users often choose their own. A short inline comment such as # Replace flipt-v2 with your release name would prevent confusion.


3. The --skip-schema-validation note is too absolute

The guide states: "The --skip-schema-validation flag is safe to use and does not affect functionality."

Skipping schema validation means Helm will no longer catch any typos or invalid keys in values.yaml for that chart, not only the ones that triggered the error. A more accurate framing:

--skip-schema-validation bypasses chart schema checks entirely. Use it as a temporary workaround while tracking upstream schema fixes — typos in your values.yaml will no longer be caught automatically when this flag is set.


4. Duplicate cross-reference in deploy-to-kubernetes.mdx

The troubleshooting guide is now linked in two places in the same file: once in the "Further Reading" resources list and once in the "Production Considerations" checklist. Linking it once (in the resources list) is sufficient; the second mention adds noise without value.


5. GitHub issue link text in Warning block

The link text flipt-io/flipt#5426 is opaque to readers unfamiliar with GitHub shorthand. A more descriptive label would improve clarity, for example:

or wait for the configurable machine fingerprint feature to ship


6. GKE default storage class

The guide states: "On GKE, the default storage class is standard." GKE's default storage class varies by cluster configuration — newer clusters commonly default to standard-rwo (ReadWriteOnce, Balanced PD). The existing tip to run kubectl get storageclass is the right call; consider making that the primary advice and dropping the platform-specific assertion to avoid outdated guidance.
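
To illustrate the point about not asserting a platform-specific class, here is a minimal sketch of a plain Kubernetes PersistentVolumeClaim that leaves storageClassName unset, so the cluster's default class (whatever kubectl get storageclass reports as default) is used. This is a generic manifest, not the Flipt chart's own template; the claim name and size are hypothetical.

```yaml
# Sketch only: a generic PVC, not taken from the Flipt Helm chart.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: flipt-data          # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi          # hypothetical size
  # storageClassName is intentionally omitted: the cluster's default
  # StorageClass is applied, so no GKE-specific value is hard-coded.
```

Setting storageClassName explicitly is only needed when the default class is not the one you want.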


Minor suggestions (non-blocking)

  • The flipt-config.yaml example in "Check License Locally" only includes license.key. If license validation depends on other environment-specific config, this minimal file could produce a false negative. A short caveat would help.
  • Consider a brief introductory sentence on the "Affected Environments" list noting it is not exhaustive — any minimal container image on a containerd runtime can be affected.

Overall this fills a real gap in the docs and is well worth merging. Addressing items 1–3 above would meaningfully improve accuracy and prevent user confusion.

- Reword --skip-schema-validation note to clarify it bypasses all checks
- Use descriptive link text for GitHub issue reference
- Drop platform-specific storage class assertions in favor of kubectl
- Add release name comment to kubectl exec example
- Remove duplicate troubleshooting link from Production Considerations

Signed-off-by: Mark Phelps <209477+markphelps@users.noreply.github.com>

claude bot commented Feb 25, 2026

PR Review: Kubernetes Troubleshooting Guide

This is a well-structured guide that addresses real-world issues users encounter. The use of Mintlify components (<Note>, <Warning>, <Tip>), step-by-step secrets management walkthrough, and consistent link formatting are all solid. A few things worth addressing before merging:


Issues

1. Cloud Run listed under Kubernetes affected environments

Cloud Run is a serverless platform — users deploying there wouldn't be using the Flipt Helm chart described in this guide at all. Including it in the "Affected Environments" list under a Kubernetes troubleshooting guide is misleading and may send Cloud Run users down the wrong path.

2. Fargate doesn't support hostPath volumes at all

The current <Warning> says:

If your cluster's security policy restricts hostPath volumes, you may need to request an exception...

For AWS Fargate, this isn't a policy restriction — hostPath volumes are fundamentally unsupported by design, regardless of policy. Fargate users cannot use this workaround at all and need to wait for the configurable machine fingerprint feature. The warning should distinguish between "restricted by policy" (solvable) and "not supported by the runtime" (Fargate, unsolvable today).

3. hostPath volume missing type: File

The volume definition doesn't specify a type:

extraVolumes:
  - name: machine-id
    hostPath:
      path: /etc/machine-id

Without type: File, Kubernetes performs no check on the path before mounting it. If /etc/machine-id doesn't exist on the host node (possible on some distros or hardened images), the mount can end up as an empty directory at that path rather than failing clearly. This would cause a silent failure where the license check still can't read the machine ID. Adding type: File makes the intent explicit and fails fast if the file is missing:

extraVolumes:
  - name: machine-id
    hostPath:
      path: /etc/machine-id
      type: File
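
Combined with the mount from the guide, the complete fragment might look like the sketch below. The mountPath and readOnly values are assumptions (the mount details are elided in the snippet quoted in the earlier review), and the top-level placement of both keys follows that snippet; it has not been verified against the chart's values.yaml.

```yaml
# values.yaml (sketch; key nesting not verified against the chart)
extraVolumes:
  - name: machine-id
    hostPath:
      path: /etc/machine-id
      type: File                  # fail fast if the host file is missing
extraVolumeMounts:
  - name: machine-id
    mountPath: /etc/machine-id    # assumption: same path inside the container
    readOnly: true                # assumption: the file only needs to be read
```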

4. kubectl create secret --from-literal exposes secrets in shell history

The example in Step 1 puts the license key directly in the command:

kubectl create secret generic flipt-secrets \
  --from-literal=FLIPT_LICENSE_KEY=your-license-key

This approach stores the secret value in the user's shell history. Consider adding a note to use --from-file or --from-env-file as a more secure alternative, or note that users should clear their history or use a secrets manager CLI to pipe the value instead.
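
One alternative that avoids typing the value into a shell at all is the External Secrets Operator, which the PR description names among the supported approaches. The sketch below assumes the operator is installed and that a SecretStore already exists; the store name and remote key are hypothetical, and the apiVersion depends on the operator version in the cluster.

```yaml
# Sketch only: assumes External Secrets Operator and an existing SecretStore.
apiVersion: external-secrets.io/v1beta1   # may differ by operator version
kind: ExternalSecret
metadata:
  name: flipt-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: my-secret-store        # hypothetical SecretStore name
    kind: SecretStore
  target:
    name: flipt-secrets          # Kubernetes Secret the operator creates
  data:
    - secretKey: FLIPT_LICENSE_KEY
      remoteRef:
        key: flipt/license-key   # hypothetical key in the external backend
```

The resulting Secret has the same name and key as the one in the kubectl example, so the rest of the walkthrough would not need to change.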

5. flipt-config.yaml with license key — no warning about committing it

The "Check License Locally" section shows creating a flipt-config.yaml with the license key in plaintext without any note that this file should not be committed to version control. A brief warning here would help.
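
One way to reduce that risk, sketched here under the assumption that the ${env:VAR_NAME} substitution described in the guide also applies when the config file is read locally, is to reference an environment variable instead of writing the key into the file:

```yaml
# flipt-config.yaml (sketch)
# Assumption: environment substitution is honored for a local config file.
license:
  key: "${env:FLIPT_LICENSE_KEY}"   # value comes from the environment, not the file
```

A file written this way contains no secret, although keeping it out of version control is still a reasonable default.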

6. Specific Bitbucket error in the generic schema validation example

The error block includes:

- flipt.config.storage.bitbucket.backend: Additional property local is not allowed

This is very specific to a Bitbucket storage configuration and may confuse users who don't use Bitbucket storage. If this is meant to illustrate common schema errors, a more representative example (or a note that this is one user's specific configuration) would be clearer.


Minor

  • installation.mdx appended sentence: The added sentence ("If you run into issues, check the Kubernetes Troubleshooting guide.") is appended to the same paragraph as the deployment guide link. Making it a new sentence on its own line, or adding it as a separate bullet after the paragraph, would read more cleanly and match the style of the surrounding content.

  • kubectl exec deployment name: The comment says "Replace flipt-v2 with your Helm release name" — but the actual Kubernetes Deployment name may differ from the Helm release name depending on the chart's naming templates. It might be worth clarifying whether this is the release name, or how to find the exact deployment name with kubectl get deploy.


Summary

The content is accurate and useful. The three most important things to fix before merging are the Fargate clarification (#2), the hostPath type (#3), and removing Cloud Run from the Kubernetes environments list (#1). The secrets-in-history concern (#4) is a best-practice note worth including but lower priority.


claude bot commented Feb 25, 2026

PR Review

Overall this is a well-structured guide that addresses real pain points for Kubernetes deployments. The writing is clear, the examples are practical, and linking to related upstream issues (flipt#5426, helm-charts#269, #270) is great for traceability. A few things worth addressing before merging:


Missing section promised in the PR description

The PR description lists five sections, but the file only has four. Section 5 ("flipt license check and Helm Values") — which was described as clarifying that flipt license check expects Flipt config (not Helm values) and showing how to check license status in a running pod — is absent from the new file. Either add the section or update the PR description to reflect what's actually there.


installation.mdx — awkward sentence append

The change tacks a second sentence onto the end of an existing paragraph:

For comprehensive deployment instructions and configuration examples, see the [Deploy Flipt v2 to Kubernetes] guide. If you run into issues, check the [Kubernetes Troubleshooting] guide.

These read as two separate thoughts merged into one place. A cleaner option would be a new line or a <Note> component:

For comprehensive deployment instructions and configuration examples, see the [Deploy Flipt v2 to Kubernetes](/v2/guides/operations/deployment/deploy-to-kubernetes) guide.

For common issues and solutions, see the [Kubernetes Troubleshooting](/v2/guides/operations/deployment/kubernetes-troubleshooting) guide.

"Rootless container environments" in Affected Environments

This entry is vague and potentially misleading. The root cause of the machine-id issue is the container runtime (containerd vs Docker), not rootless execution specifically. A rootless Docker setup would still have the Docker fallback paths. Consider removing this or replacing it with something more precise, like "Kubernetes clusters without Docker-compatible cgroup paths."


Verify flipt.envFrom and flipt.extraEnvVars paths in Helm values

The secrets section uses:

flipt:
  envFrom:
    - secretRef:
        name: flipt-secrets

and

flipt:
  extraEnvVars:
    - name: FLIPT_LICENSE_KEY
      ...

Helm chart value paths are easy to get subtly wrong. Please confirm these match the actual keys in the flipt-v2 chart's values.yaml. If the chart uses envFrom or extraEnvVars at the top level (not nested under flipt:), the examples would silently have no effect.
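
For context, the two halves of the pattern under discussion can be sketched together as follows. The nesting under flipt: mirrors the snippets above and is exactly what still needs verifying; the flipt.config path is inferred from the schema error quoted in an earlier review (flipt.config.storage...) and is likewise an assumption.

```yaml
# values.yaml (sketch; all key paths unverified against the chart)
flipt:
  envFrom:
    - secretRef:
        name: flipt-secrets              # Secret holding FLIPT_LICENSE_KEY
  config:
    license:
      key: "${env:FLIPT_LICENSE_KEY}"    # resolved by Flipt at startup
```

If the chart expects envFrom at the top level instead, the Secret would never be injected and the reference would resolve to nothing, which is the silent failure described above.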


Minor: --skip-schema-validation workaround could note the right fix path

The <Note> says to track upstream schema fixes in flipt-io/helm-charts, but doesn't link to a specific open issue. If there's a tracking issue (e.g., helm-charts#270 or similar), linking it directly would make it easier for users to follow up and know when they can drop the flag.


What's in good shape

  • Frontmatter is correct (title + description present)
  • All code blocks have language tags (text, bash, yaml)
  • Internal links use relative paths and markdown link syntax
  • <Note>, <Warning>, and <Tip> components are used appropriately
  • The hostPath volume mount example in the machine-id section is accurate and actionable
  • The ${env:VAR_NAME} substitution pattern is correctly documented and cross-referenced

markphelps marked this pull request as ready for review on February 25, 2026 20:32
markphelps requested a review from a team as a code owner on February 25, 2026 20:32
Co-authored-by: Roman Dmytrenko <rdmytrenko@gmail.com>
markphelps requested a review from erka on February 26, 2026 01:36

claude bot commented Feb 26, 2026

PR Review

Overall this is a solid, well-structured troubleshooting guide with clear code examples and good use of Mintlify callout components. Here are my observations.

Issues

1. Cloud Run missing from Affected Environments

The PR description lists Cloud Run as an affected environment (GKE, EKS, Fargate, Cloud Run), but it is absent from the Affected Environments list in the document. If it is a real affected platform, it should be added for completeness.

2. Redundant entries in Affected Environments

The list includes both 'Amazon EKS with containerd runtime' and 'Amazon EKS with AWS Fargate node groups'. Fargate is already covered by the general 'Any Kubernetes cluster using containerd instead of Docker' bullet. Consider collapsing these, or adding a brief clarifier that distinguishes why Fargate is called out separately.

3. Helm Charts issues not linked in the document

The PR description references flipt-io/helm-charts#270 and flipt-io/helm-charts#269 as related upstream fixes. Linking to these in the relevant Note/Warning callouts would help users track when the workarounds become unnecessary, similar to how flipt-io/flipt#5426 is already referenced in the Machine ID warning.

Minor Suggestions

4. No guidance for Helm older than 3.13

The guide says --skip-schema-validation requires Helm 3.13+, but does not say what to do on older versions. A brief note suggesting upgrading Helm or removing the unrecognized keys from values.yaml as an alternative would be helpful.

5. Guide ends without a closing section

The deploy-to-kubernetes.mdx guide has a 'Next Steps' section with related resources. This guide ends abruptly after the last YAML block. Even a small 'Related' or 'See Also' section pointing back to the main Kubernetes deploy guide and configuration reference would improve the reading experience.

6. installation.mdx sentence addition

The new sentence is appended inline to an existing paragraph. Since the two sentences have different scopes (happy path vs. fallback), a line break or new paragraph would improve scannability.


What is Working Well

  • Frontmatter is complete and correctly formatted
  • All code blocks have language tags (text, bash, yaml)
  • All links use proper markdown syntax with descriptive text
  • Internal links use relative paths
  • The Note, Warning, and Tip callouts are used appropriately and add real value
  • The anchor link /v2/configuration/overview#environment-substitution-and-secret-references resolves correctly against the existing heading in that page
  • Navigation entry in docs.json is correctly placed after the existing Kubernetes deploy page
  • The envFrom plus env:VAR_NAME pattern for GitOps secrets is well-explained and practically useful

markphelps merged commit aecc8d9 into main on Feb 26, 2026 (5 checks passed)
markphelps deleted the docs/k8s-troubleshooting branch on February 26, 2026 15:21