DAG Status Information

This document is a living record of which DAGs are disabled or unstable, and why. It helps us track the issues affecting each DAG and know which DAGs can be turned back on, and when.

A “Disabled” DAG is turned off in production. An “Unstable” DAG is turned on (often in order to consume partial data) but is raising expected or known errors.

Note: The DAG column links to Airflow directly, which is not currently publicly accessible. We are working on improving our Role-Based Access setup in order to allow community members to view Airflow without an account. We’ll make an announcement on the Make WP blog once this is ready!

| DAG | Status | Reason | Last Updated |
|---|---|---|---|
| auckland_museum_workflow | Disabled | Disabled until data quality related to broken media links can be investigated, and parameters adjusted to ensure the DAG will correctly do its initial backfill. There is also an issue to look at duplicates with Wikimedia; we may be able to turn on the DAG while this is still being investigated (as long as we don’t enable the provider in the API), but we should consider it carefully. | 2024-01-12 |
| Image expiration | Disabled | Not yet investigated. | |
| Oauth | Disabled | OAuth is not set up or needed for providers yet. | |
| Flickr reingestion | Disabled | Disabled while investigating to make sure we don’t hit API rate limits. | 2023-03-31 |
| Reported media pending review | Disabled | Pending a clear set of actionable steps for the maintainer/content safety team when media is reported. | 2023-02-17 |
| iNaturalist | Disabled | Paused due to a broken hardcoded Catalog of Life URL, awaiting response from iNaturalist. | 2024-01-18 |
| Popularity refreshes (audio, image) | Disabled | Must not be re-enabled until https://github.com/WordPress/openverse/pull/2883 is merged to update popularity constants within the metrics table, rather than the constants view. We should try running an audio popularity refresh *before* attempting an image popularity refresh. The performance, particularly of the constants update, should be monitored. | 2023-08-28 |
