How to keep VERCEL_OIDC_TOKEN from "busting" the cache for our turbo tasks

Summary

Hello! On our CI we pull env vars from Vercel, since we use Vercel to manage them. We use the Vercel CLI for that.
We rarely, if ever, get cache hits because the VERCEL_OIDC_TOKEN value inside our .env file changes between runs. Our turbo.json looks like this:

{
// ..
    "test": {
      "dependsOn": ["test:assets:pull"],
      "passThroughEnv": ["VERCEL_OIDC_TOKEN"]
    },
// ..
}

And still we see cache misses.

We can see why we get cache misses from the --dry-run=json output:

{
  "id": "33Pw2FCX7nN0Zl8yUeo1wu5LsMp",
  "version": "1",
  "turboVersion": "2.4.1",
  "monorepo": true,
[...]
    "rootKey": "I can’t see ya, but I know you’re here",
    "files": {
      ".env": "THIS_IS_STABLE",
      "apps/web/.env": "THIS_CHANGES_BETWEEN_RUNS"
    },
[...]
}

If we run --dry-run=json twice and compare the outputs, the apps/web/.env is always changing. If we compare the .env files between runs, the value that changes is always VERCEL_OIDC_TOKEN.
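
For anyone wanting to reproduce that comparison, here is a minimal sketch of how two pulled .env files can be compared by key name, so that no secret values end up in the CI log. The two files are simulated here (the names run1.env and run2.env and their contents are made up for illustration); in practice they would be copies of apps/web/.env saved from two separate runs.

```shell
# Simulate the .env files from two CI runs (hypothetical contents).
workdir="$(mktemp -d)"
printf 'API_URL=https://example.test\nVERCEL_OIDC_TOKEN=token-from-run-1\n' > "$workdir/run1.env"
printf 'API_URL=https://example.test\nVERCEL_OIDC_TOKEN=token-from-run-2\n' > "$workdir/run2.env"

# comm needs sorted input.
sort "$workdir/run1.env" > "$workdir/run1.sorted"
sort "$workdir/run2.env" > "$workdir/run2.sorted"

# comm -3 keeps only lines unique to one of the files; lines from the
# second file are tab-indented, so strip leading whitespace, then keep
# only the key name in front of the first "=".
changed_keys="$(comm -3 "$workdir/run1.sorted" "$workdir/run2.sorted" \
  | sed 's/^[[:space:]]*//' \
  | cut -d= -f1 \
  | sort -u)"

echo "$changed_keys"
```

With the simulated files above, the only key reported is VERCEL_OIDC_TOKEN.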

Any ideas?

In short, we do:

cd apps/web
vercel link --scope team --yes
vercel env pull .env --yes
turbo run test # we get a cache miss; the next time CI runs, another cache miss

Hey, @bitttttten, I can see how this can be confusing. I’ll shed some light.

Turborepo has to treat environment variables that come from the operating system differently from ones that come from files. I’m realizing our documentation needs to be clearer about this. I’ll update that.

In the meantime, the specific reasoning behind the behavior you’re seeing is:

  • passThroughEnv is for passing through environment variables that come from the operating system. Because vc env pull writes a file, passThroughEnv plays no part in how Turborepo hashes that .env file. The passThroughEnv configuration you’re showing has no effect on this situation.
  • The apps/web/.env file is likely changing in between test runs, since VERCEL_OIDC_TOKEN is short-lived. When that file changes, the hash changes, resulting in a cache miss.
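
To illustrate the second point, here is a small sketch showing that a content hash of the file changes as soon as the token value rotates, even when every other line is identical. It uses cksum purely as a stand-in; this is not Turborepo’s actual hashing algorithm, and the file contents are hypothetical.

```shell
workdir="$(mktemp -d)"
env_file="$workdir/.env"

# First CI run: the pulled file contains a short-lived token.
printf 'API_URL=https://example.test\nVERCEL_OIDC_TOKEN=token-from-run-1\n' > "$env_file"
hash_run1="$(cksum < "$env_file")"

# Second CI run: only the token differs.
printf 'API_URL=https://example.test\nVERCEL_OIDC_TOKEN=token-from-run-2\n' > "$env_file"
hash_run2="$(cksum < "$env_file")"

if [ "$hash_run1" != "$hash_run2" ]; then
  echo "file hash changed: any task that hashes this file misses cache"
fi
```

Nothing in the shell environment is involved here, which is why a setting that only concerns OS-level variables cannot change the outcome.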

The question that leaves me with is: Why is that file included in your hashes?

  • Turborepo’s hashing ignores .gitignore’d files by default. Is .env ignored in your gitignore?
  • Is there anywhere that this file is being added to the hash? Something like .env or **/.env* in a turbo.json?
  • Another way that these can sneak by is if you use the inputs key on a task, but you’re showing a task that doesn’t have one, so I’ll rule that out.
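
As a rough way to check the first two questions: in a real checkout, git check-ignore -v apps/web/.env shows whether (and by which pattern) the file is ignored, and a search over every turbo.json shows where .env files are referenced. The sketch below demonstrates the search against a simulated repo; the directory layout and file contents are assumptions for illustration.

```shell
# Simulated repo; in a real checkout, run the find command from the repo root.
repo="$(mktemp -d)"
mkdir -p "$repo/apps/web"
cat > "$repo/turbo.json" <<'EOF'
{
  "globalDependencies": ["**/.env", ".env"],
  "tasks": { "test": {} }
}
EOF

# List every turbo.json line that mentions ".env", skipping node_modules.
# /dev/null is passed as an extra file so grep always prints the file name.
matches="$(cd "$repo" && find . -name turbo.json ! -path '*/node_modules/*' \
  -exec grep -n '\.env' /dev/null {} +)"

echo "$matches"
```

Any hit under globalDependencies or a task’s inputs is a candidate for why the file ends up in the hash.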

To summarize:

  • passThroughEnv doesn’t affect this behavior.
  • The contents of the apps/web/.env file are indeed changing, since VERCEL_OIDC_TOKEN is changing.
  • It’s hard to tell why the .env file is included in your hashing without more detail.

If you can reason about those pieces with your repo, you’ll be able to get cache hits for that task.


The question that leaves me with is: Why is that file included in your hashes?

Right! Well we have this in our turbo config:

{
  "$schema": "https://turbo.build/schema.json",
  "globalDependencies": ["**/.env", ".env"],
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": ["dist/**", ".next/**"],
      "env": [
        "TURBO_TEAM",
        "TURBO_PREFLIGHT",
        "TURBO_TOKEN",
        "TURBO_API",
        // ..
        "NODE_ENV",
        "VITE_*",
        "NEXT_PUBLIC_*",
        "VERCEL_*",
        "VERCEL",
        "OTEL_SERVICE_NAME",
        "OTEL_EXPORTER_OTLP_SPAN_ENDPOINT",
        "OTEL_EXPORTER_OTLP_SPAN_TOKEN",
        /// 
      ],
      "passThroughEnv": ["VERCEL_OIDC_TOKEN"]
    },
  }
}

So someone has added "globalDependencies": ["**/.env", ".env"], :thinking:

If I am understanding your message, you’re saying that this shouldn’t necessarily be in there?

In that case, Turborepo is exhibiting the expected behavior. :+1:

As you’re seeing, globalDependencies is a big hammer. Any change to the contents of any of those files will cause cache misses for every task. What folks usually want is for build to miss cache on changes to .env files while everything else can still hit cache. That would look like:

{
  "tasks": {
    "build": {
      "inputs": ["$TURBO_DEFAULT$", "$TURBO_ROOT$/.env", ".env*"]
    }
  }
}

Docs for $TURBO_DEFAULT$: Configuring turbo.json | Turborepo
Docs for $TURBO_ROOT$: Configuring turbo.json | Turborepo

This would mean that .env* is meaningful for your build task, but not for any of your other tasks. That’s usually what people are looking for. If it’s not what you’re after, hopefully it can at least help you construct the behavior you want.


Wow thanks so much! This is super helpful. Indeed I think the intention is only to do this for builds.

So, for example, if someone sets “tasks.build.env”, that only covers environment variables from the shell. Having the .env file in tasks.build.inputs means its whole contents are hashed. What I mean is that I sometimes see variables listed in the “env” array as well. Those entries would then be unnecessary, right?

i.e.

{
  "$schema": "https://turbo.build/schema.json",
  "ui": "tui",
  "tasks": {
    "build": {
      "dependsOn": ["^build"],
      "outputs": [".next/**"],
      "inputs": ["$TURBO_DEFAULT$", "$TURBO_ROOT$/.env", ".env*"],
      "env": ["SOME_ENV_VAR"]
    }
  }
}

So if SOME_ENV_VAR was only ever set inside the .env file, and never “exported”, then listing it in “env” would also be unnecessary, right? Basically, “env” looks at shell environment variables; it doesn’t look inside the .env file (unless you manually export all vars from the .env file before running build). Is that also correct to say?

Thanks for the links to the docs, really helps!


Yes, env only applies to system environment variables coming directly from the OS.
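
A quick way to see that distinction outside of Turborepo is the sketch below: a variable that exists only in a .env file is invisible to child processes until something exports it. The file contents are hypothetical, and sourcing the file like this assumes it only contains shell-compatible KEY=value lines.

```shell
workdir="$(mktemp -d)"
printf 'SOME_ENV_VAR=from-dotenv-file\n' > "$workdir/.env"
unset SOME_ENV_VAR

# The file exists, but nothing has been exported, so a child process
# (which is what a task runner would start) does not see the variable.
before="$(sh -c 'printf "%s" "${SOME_ENV_VAR:-<unset>}"')"

# Export everything the file defines: set -a marks every variable
# assigned from here on for export, until set +a turns that off.
set -a
. "$workdir/.env"
set +a

after="$(sh -c 'printf "%s" "${SOME_ENV_VAR:-<unset>}"')"

echo "before: $before"
echo "after:  $after"
```

Only in the "after" case is there an OS-level variable for an “env” entry to pick up.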
