DAG Status Information

This document is a living record of which DAGs are disabled or unstable, and why. It helps us track the issues affecting each DAG and know when a given DAG can be turned back on.

A “Disabled” DAG is turned off in production. An “Unstable” DAG is turned on (often in order to consume partial data) but is raising expected or known errors.

Note: The DAG column links to Airflow directly, which is not currently publicly accessible. We are working on improving our Role-Based Access setup in order to allow community members to view Airflow without an account. We’ll make an announcement on the Make WP blog once this is ready!

DAG | Status | Reason | Last Updated
audio_data_refresh & image_data_refresh | Disabled | Temporarily paused while working on updating the title column in the API & Catalog DBs, see issue | 2024-05-10
auckland_museum_workflow | Disabled | Disabled until data quality related to broken media links can be investigated, and parameters adjusted to ensure the DAG will correctly do its initial backfill. There is also an issue to look at duplicates with Wikimedia; we may be able to turn on the DAG while this is still being investigated (as long as we don’t enable the provider in the API), but we should consider it carefully. | 2024-01-12
Oauth | Disabled | OAuth is not set up or needed for providers yet | 
Flickr reingestion | Disabled | Disabled while investigating to make sure we don’t hit API rate limits. | 2023-03-31
Reported media pending review | Disabled | Pending a clear set of actionable steps for the maintainer/content safety team when media is reported. | 2023-02-17
Metropolitan museum reingestion workflow | Disabled | Pending a mechanism for reducing the time reingestion takes: Investigate metropolitan reingestion workflow failures/time outs | 2024-03-12
Science museum workflow | Disabled | Requires an update to the API parsing code | 2024-04-11

Last updated: