DAG Status Information

This document serves as a living record of which DAGs are disabled or unstable, and why. It helps track the issues affecting each DAG and shows which DAGs can be turned back on, and under what conditions.

A “Disabled” DAG is turned off in production. An “Unstable” DAG is turned on (often in order to consume partial data) but is raising expected or known errors.

Note: The DAG column links to Airflow directly, which is not currently publicly accessible. We are working on improving our Role-Based Access setup in order to allow community members to view Airflow without an account. We’ll make an announcement on the Make WP blog once this is ready!

| DAG | Status | Reason | Last Updated |
| --- | --- | --- | --- |
| Image expiration | Disabled | Not yet investigated. | |
| Oauth | Disabled | Oauth is not set up or needed for providers yet. | |
| Flickr reingestion | Disabled | Disabled while investigating to make sure we don’t hit API rate limits. | 2023-03-31 |
| Phylopic (standard & reingestion) | Disabled | DAG requires upgrade to API v2. | 2023-02-13 |
| Reported media pending review | Disabled | Pending a clear set of actionable steps for the maintainer/content safety team when media is reported. | 2023-02-17 |
| iNaturalist | Disabled | Paused until the iNaturalist schedule interval is changed. | 2023-03-28 |
| Snapshot rotation DAG | Disabled | Not yet tested with a staging database. | 2023-03-30 |
| NYPL | Disabled | The v1 API of NYPL stopped working. Pending update to v2. | 2023-07-31 |
| Popularity refreshes (audio, image) | Disabled | Must not be re-enabled until https://github.com/WordPress/openverse/pull/2883 is merged to update popularity constants within the metrics table, rather than the constants view. We should try running an audio popularity refresh *before* attempting an image popularity refresh. The performance, particularly of the constants update, should be monitored. | 2023-08-28 |
