Cloud Composer CI/CD example #390
Conversation
@tims unfortunately we are duplicating a lot of CI/CD-for-Composer work.
Apologies for not increasing visibility on this work outside of AMER.
I've been on this train for a while and have some further lessons learned.
I've been working in https://source.cloud.google.com/datapipelines-ci/datapipelines-ci/+/master and plan on open sourcing it as its own repo. It's very similar to this blog I wrote w/ a partner, but less infrastructure heavy (not requiring a Spinnaker / GKE cluster).
I think CI/CD for Composer needs to cover more things; it's really important to handle:
- testing / deploying SQL queries used by BQ
- custom conn_ids
- testing / deploying custom plugins
- actually integration testing against Composer, not just an Airflow image (we can do this by copying DAGs to the data directory and running list_dags here; note that Composer is slightly divergent from OSS Airflow)
Additionally, I like to be prescriptive / opinionated about things like asserting dag_id == filename in the CI framework (a sketch of such a check follows below).
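To illustrate that last point, here is a minimal sketch of such a check, assuming DAG files live in a local dags/ directory (the path and test name are illustrative, not from this PR):

import os

from airflow.models import DagBag

def test_dag_id_matches_filename():
    # Load DAGs without running them; skip Airflow's bundled example DAGs.
    dag_bag = DagBag(dag_folder="dags/", include_examples=False)
    for dag_id, dag in dag_bag.dags.items():
        # Compare each dag_id against the basename of the file defining it.
        expected = os.path.splitext(os.path.basename(dag.fileloc))[0]
        assert dag_id == expected, "dag_id %r != filename %r" % (dag_id, expected)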
- -c
- |
  pwd
  ls -l
  # Init an empty airflow metadata db
  airflow initdb
  # Initialise variables to configure airflow DAGs
  airflow variables -s input gs://test-input-bucket/
  airflow variables -s output gs://test-output-bucket/
  # Run a backfill of our DAG for a specific date; add more DAGs here if required.
  airflow backfill echo_and_wait --start_date=2020-01-03 --end_date=2020-01-03
  ret=$?
  # Cat the task logs so we can see whether there was any error
  find /workspace/airflow/logs/ -type f -name '*.log' | sort | xargs cat
  # Fail if the backfill failed.
  exit $ret
This should be taken out into its own script.
Notably, this will also cause red-herring test failures because it won't handle:
- the customer having custom conn_ids
- the customer using crypto / a Fernet key
- the customer also deploying custom plugins
(a test-setup sketch for the first two follows below)
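As a hedged sketch of how a test harness might avoid those red herrings, setup code along these lines could seed placeholder connections and variables into the freshly initialised metadata DB before the backfill runs (all conn_ids and values below are hypothetical; a Fernet key would additionally need to be exported to the build step, e.g. via AIRFLOW__CORE__FERNET_KEY):

from airflow import settings
from airflow.models import Connection, Variable

def seed_test_metadata():
    # Register a custom connection the DAGs under test reference by conn_id.
    # conn_id and host here are placeholders, not from this example.
    session = settings.Session()
    session.add(Connection(conn_id="my_api", conn_type="http",
                           host="https://api.example.com"))
    session.commit()
    # Seed the variables the DAGs read at parse/run time.
    Variable.set("input", "gs://test-input-bucket/")
    Variable.set("output", "gs://test-output-bucket/")

if __name__ == "__main__":
    seed_test_metadata()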
I don't have access to that repo; permission denied.
I thought it might be more readable in the same file. And since this only tests one DAG this way, it's more of an example than a best practice. This is just to demo how you might override variables before creating and executing a DAG run.
I definitely agree a more complete solution would be great. But it would be good to have something simple like this to link to until then. We can change the link to that when it's ready.
I'd suggest we merge this for now and come up with the better solution next; we're much better off having velocity and revising than waiting for perfect.
Also, I consider the main benefit of this example to be just the syntax checking; it covers the most immediate need for customers. People tend to do very different things for end-to-end tests, and most don't do them at all; they just see what happens in prod. So it's more of a starting point, and it helps explain how simple doing things with Cloud Build is (a sketch of such a check is below).
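For reference, the syntax checking amounts to something like this minimal sketch (assuming a local dags/ directory; not the exact code in this PR):

from airflow.models import DagBag

# Parse every file under dags/; any Python or DAG-definition error shows up
# in import_errors without executing a single task.
dag_bag = DagBag(dag_folder="dags/", include_examples=False)
if dag_bag.import_errors:
    raise SystemExit("DAG import errors: %s" % dag_bag.import_errors)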
I opened a PR a while back for the CI/CD-for-data-pipelines solution that didn't get much traction; it's very similar to what I'm working on.
You could link to my fork: https://github.com/jaketf/ci-cd-for-data-processing-workflow
I think that handling plugins / connections is a must even for a simple example.
IMHO the Cloud Build config should be like a main function: very small, high-level, readable steps; all implementation details like initdb, variables, etc. should be buried in scripts.
# Deploy the DAGs to your composer environment DAGs GCS folder
id: Deploy DAGs
args:
- -m
nit: I've had really flaky issues with deleting DAGs w/ Composer when paused-on-creation is not manually configured to True. I also consider this a best practice in case a DAG is accidentally copied here manually and shouldn't actually be scheduled.
If this config is set, this script won't handle unpausing the DAG.
Note (though not necessary): I've seen a desire for a config that only deploys certain DAGs. Sometimes users may need to run multiple deploys to sequence their updating of DAGs due to inter-DAG dependencies, or they have some temporary utility DAGs they want to deploy, run, and undeploy without removing the DAG from their codebase.
I've solved this with a running_dags.txt (see the sketch below).
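A rough sketch of that running_dags.txt idea, assuming a manifest with one DAG filename per line and the google-cloud-storage client (the bucket name and paths are illustrative):

from google.cloud import storage

def deploy_listed_dags(bucket_name="my-composer-bucket",
                       manifest="running_dags.txt"):
    client = storage.Client()
    bucket = client.bucket(bucket_name)
    with open(manifest) as f:
        for line in f:
            dag_file = line.strip()
            if not dag_file or dag_file.startswith("#"):
                continue  # skip blanks and comments in the manifest
            # Upload only the DAGs listed in the manifest to the
            # environment's DAGs folder.
            bucket.blob("dags/%s" % dag_file).upload_from_filename(
                "dags/%s" % dag_file)

if __name__ == "__main__":
    deploy_listed_dags()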
That should be an environment-wide setting, no?
My preference would be to set the environment-wide setting so all new DAGs are paused on creation.
If a DAG is updated, I don't think it should ever automatically unpause; I'd much prefer the user has to do that once manually.
But this is very user- and environment-specific. I don't think CI/CD should be particularly opinionated about it (a sketch of a CI guard for this setting follows below).
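As a sketch, a CI guard for that preference could simply assert the setting wherever the Airflow config is readable (an assumption about where such a check would run, not part of this PR):

from airflow.configuration import conf

# Fail fast if new DAGs would start scheduling immediately on creation.
assert conf.getboolean("core", "dags_are_paused_at_creation"), \
    "Set core.dags_are_paused_at_creation = True for this environment"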
@tims please try to access again.
@tims FYI this is where I'm hoping to open source this as a refactor of the public solution.
@jaketf Unfortunately I think there are a few issues with the run_tests script you linked to; it's more generic, but not generic enough. It takes connections from an existing Composer environment, but testing should probably not use the same connections as a Composer env; they should be provided by the build, if any. It also assumes HTTP connections, which is not ideal. It also assumes the variables are in a specifically named local file, presumably committed; if we're going to make it more generic than inlining in the cloudbuild.yaml, these should be configurable. It expects a plugins dir, which a simple example does not require. Similarly, it expects a bigquery/sql dir. I still think we should merge as is and link to yours as the advanced solution when it's ready.
@jaketf we can definitely open up a new PR on this when yours is ready to converge. This one should be kept as simple as possible as an illustration for people who are rolling their own CI/CD (some customers are even using only Airflow, not Composer). And we can put the more complete solution in your repo.
@tims makes sense, glad you got this merged. Thank you, that's really valuable feedback on the run_tests script. I think there's huge value in making it a bit more configurable / modular. I've migrated this to a PR in the open; it will be a while until we can rewrite the solutions page, but I'd appreciate any review comments like this! https://github.com/GoogleCloudPlatform/ci-cd-for-data-processing-workflow/pull/2
Example that uses Cloud Build to test Airflow DAGs and deploy them to Cloud Composer.
When deploying Cloud Composer environments, people often ask, "How do I test my DAGs?"
The answer is CI/CD: test, then deploy.
I've prepared an example of this using Cloud Build.
PSO Eng Bug Id = 147466944