A week in Openverse: 2024-01-22 – 2024-01-29

openverse

Merged PRs

  • #3706: Publish changelog for catalog-2024.01.25.17.42.59
  • #3705: Use format over toJSON for escaping quotes in GHA
  • #3702: Do not use unsupported folded chomping multiline YAML block syntax
  • #3701: Update dependency apache-airflow to v2.8.1 [SECURITY]
  • #3699: Wrap PR and discussion titles in `toJSON` to escape them in Slack
  • #3698: Enable the isort rules for ruff and fix linting issues
  • #3697: Use correct path when checking if DAGs.md needs to be regenerated
  • #3696: Update WordPress Photo Directory DAG to ingest weekly
  • #3692: Bump pillow from 10.1.0 to 10.2.0 in /utilities/project_planning
  • #3691: Update dependency Pillow to ~=10.2.0 [SECURITY]
  • #3690: Bump pillow from 10.0.1 to 10.2.0 in /utilities/provider_tallies
  • #3689: Revert "Toggle CloudWatch alarms actions during Data Refresh (#3652)"
  • #3687: Remove needs db and add patch
  • #3685: Bump jupyterlab from 4.0.8 to 4.0.11 in /utilities/project_planning
  • #3684: Bump notebook from 7.0.6 to 7.0.7 in /utilities/project_planning
  • #3680: Remove outdated Docker image deployment workflow and documentation
  • #3676: Combine write to file and STDOUT in one command using `tee`
  • #3669: Update Rawpixel image URLs for ingestion
  • #3596: Don't trigger PR limit reminder on PRs that are ignored
  • #3586: Make the terms of service apply to all services
  • #3568: Update copy changes implementation plan to avoid table rename
  • #3537: Add create_new_es_index DAGs

Closed issues

  • #3710: <Replace this with actual title>
  • #3695: Ruff failing to correctly sort imports
  • #3436: Skip `needs_db` and always map ES hits to DB results
  • #2492: Large SVGs can cause gunicorn workers to abort
  • #2372: Build ES index creation DAG
  • #2324: Implementation Plan – Nuxt 3 Migration

openverse-infrastructure

Merged PRs

  • #765: Add ssh-ed25519 key for Olga
  • #764: Add ssh-ed25519 key for Madison
  • #763: Bump catalog version
  • #762: 🔄 synced file(s) with WordPress/openverse
  • #751: Update stacimc key to ed25519
  • #750: Add playbook to sync ssh keys to all ec2 instances
  • #743: Configure ansible to route through the bastion

Closed issues

  • #707: Route Ansible SSH connections through the jumphost and disable public IPs on all infrastructure

#openverse, #week-in-openverse

Community Meeting Recap (2024-01-23)

[Meeting start]

Agenda

📢 Reminder: the frontend is still in a code freeze, but as soon as we get the Nuxt 3 PR merged, we can unfreeze it! 🧊

[Meeting end]

#openverse-weekly-community-meeting

A week in Openverse: 2024-01-15 – 2024-01-22

openverse

Merged PRs

  • #3681: Add missing spaces in PR ping
  • #3679: Add user-agent header for rawpixel
  • #3678: Publish changelog for ingestion_server-2024.01.18.18.31.08
  • #3677: Bump jupyter-lsp from 2.2.0 to 2.2.2 in /utilities/project_planning
  • #3674: Allow CI + CD workflow to deploy from a different branch
  • #3672: Update alarms runbooks
  • #3671: Publish changelog for api-2024.01.16.17.28.14
  • #3668: Add Rawpixel to authority data as CURATED
  • #3667: Update tests for Nuxt 3 migration
  • #3662: fix: broken pipenv install link in general quickstart guide
  • #3656: Simplify load sample data
  • #3655: Bump jinja2 from 3.1.2 to 3.1.3 in /utilities/project_planning
  • #3653: Fix formatting of PR ping Slack messages
  • #3652: Toggle CloudWatch alarms actions during Data Refresh
  • #3649: Draft PRs fix for PR Board automation
  • #3629: Add documentation for yearly planning
  • #3618: Do not ingest Jamendo records with downloads disabled
  • #3590: Additional service checks for ingestion server health endpoint
  • #3570: Add codeowners pre-commit check
  • #3025: Prevent iNaturalist from running alongside any other DAGs

Closed issues

  • #3647: Disable alarm notifications during ES index creation
  • #3630: Fix link to `pipenv` installation instructions on general setup guide
  • #3530: Do not ingest Jamendo tracks that do not allow downloads
  • #3467: PRs that both have changes requested and are drafted should be in the drafted column of the PR board
  • #2505: Frontend response count alarms
  • #2504: Frontend response time alarms
  • #2501: General API response time alarms
  • #2344: ECS Alarms for anomalous behavior
  • #2019: Add additional checks to ingestion server healthcheck endpoint
  • #1276: Prevent iNaturalist from running alongside any other DAGs

openverse-infrastructure

Merged PRs

  • #759: 🔄 synced file(s) with WordPress/openverse
  • #757: Bump ingestion server to rel-2024.01.18.18.31.08
  • #754: 🔄 synced file(s) with WordPress/openverse
  • #753: Stabilize API Production alarms
  • #752: Stabilize and update Nuxt alarms settings
  • #749: 🔄 synced file(s) with WordPress/openverse
  • #736: Set up DNS records for ingestion server in staging

Closed issues

  • #14: Distinguish between dev & prod ingestion server DNS records

#openverse, #week-in-openverse

A week in Openverse: 2024-01-08 – 2024-01-15

openverse

Merged PRs

  • #3657: Fix PR limit reminders
  • #3654: Bump jinja2 from 3.1.2 to 3.1.3 in /documentation
  • #3651: Fix slack ping action version reference
  • #3648: Add additional_query_params to provider DAG configuration
  • #3645: Publish changelog for api-2024.01.08.20.06.53
  • #3644: Increase ES throttling rate in Ingestion Server
  • #3643: Publish changelog for frontend-2024.01.08.19.12.18
  • #3628: pgcli version is extracted into a docker argument
  • #3627: Remove unused AWS variables
  • #3623: Add secret-key check
  • #3622: Convert unittest to pytest
  • #3594: Documentation for becoming a committer, reorganize contributing pages
  • #3577: Replaced `cURL` slack workflows with `slackapi`
  • #3258: Retrieve Auckland Museum Image Data

Closed issues

  • #3533: Add optional `additional_query_params` config to provider DAGs
  • #3461: Replace cURL to Slack with `slackapi/slack-github-action`
  • #3456: Add a check on startup to ensure application is not using default key in live environments
  • #3425: Convert unittest.TestCase tests to pytest
  • #2959: Extract the pgcli version for the Dockerfile from the api Pipfile
  • #1771: Auckland Museum

openverse-infrastructure

Merged PRs

  • #745: Increase settings of Response Time alarms for the API and Nuxt

Closed issues

  • #729: Delete thumbnails log group

#openverse, #week-in-openverse

X-post: Call for Mentees & Mentors: Contributor Mentorship Program Cohort #2 (2024 Q1)


A week in Openverse: 2024-01-01 – 2024-01-08

openverse

Merged PRs

  • #3625: Publish changelog for catalog-2024.01.04.20.39.50
  • #3624: HOTFIX: Convert encoded HTTP connection variables to http type not https
  • #3617: Silence the gunicorn access log
  • #3615: Update dependency @playwright/test to v1.40.1
  • #3614: Refactor Playwright tests
  • #3613: Delete unused VOldIconButton
  • #3612: Publish changelog for api-2024.01.01.19.52.14
  • #3611: Remove step to deploy production thumbnails
  • #3610: Publish changelog for frontend-2024.01.01.19.52.49
  • #3609: Update dependency eslint-plugin-playwright to ^0.21.0
  • #3608: Update dependency elasticsearch to v8.11.1
  • #3607: Update dependency babel-loader to v8.3.0
  • #3606: Update dependency async-mutex to ^0.4.0
  • #3604: Update workflows
  • #3603: Update dependency vue-tsc to v1.8.27
  • #3602: Update dependency @types/node to v18.19.4
  • #3591: Update Airflow filtered warnings, address deprecations
  • #3556: Update pnpm from 7.17.1 to 8.12.1
  • #3494: Implementation Plan: Django admin moderator access control and base improvements

Closed issues

  • #3382: Redis 7.x upgrade
  • #3349: Related endpoint returns 500 if the main item is not found in ES index
  • #3237: Duplication in API logging
  • #1966: Implementation Plan: Django admin access control and tool improvements
  • #1052: Upgrade to pnpm@8

openverse-infrastructure

Merged PRs

  • #742: Bump catalog version
  • #741: 🔄 synced file(s) with WordPress/openverse
  • #739: Update Redis `engine_version` and `parameer_group_name`
  • #731: Complete command for SSH key configuration

Closed issues

  • #611: Upgrade Redis to 7.x

#openverse, #week-in-openverse

A week in Openverse: 2023-12-25 – 2024-01-01

openverse

Merged PRs

  • #3600: Bump jwcrypto from 1.5.0 to 1.5.1 in /api
  • #3597: Docs: Fix typo
  • #3595: Update meeting times in README
  • #3593: Unify Vue components
  • #3589: Add "🗄️ aspect:data" label
  • #3588: Publish changelog for api-2023.12.26.05.11.22
  • #3575: Add a script to observe the API during Redis migration
  • #3572: Update dependency apache-airflow to v2.8.0 [SECURITY]
  • #3532: Remove Internet Archive Book Images sub provider

Closed issues

  • #3574: Add an `aspect: data` label?
  • #2661: Airflow scheduler will crash when connection to the database drops, but container will not stop
  • #754: Collect data on API usage

openverse-infrastructure

Merged PRs

  • #738: 🔄 synced file(s) with WordPress/openverse
  • #737: Increase datapoints_to_alarm for p99 response time threshold alarm

Closed issues

  • #404: Ignore `latest_restorable_time` changes in `module.staging-api.module.rds.aws_db_instance.this`

#openverse, #week-in-openverse

Looking forward: Openverse in 2024

The Openverse maintainers have completed our planning for 2024! Below is the list of projects we plan on tackling, roughly ordered by priority. The full list can also be seen on our project tracker on GitHub. Each of the projects falls into one of the following themes:

  • Improve Search Relevancy
  • Refine Search Experience
  • Make Openverse Safer to Use & Maintain
  • Broaden our Data
  • Progress Service & Code Resiliency
  • Engage the Community

Projects for 2024

Want to get involved? Check out our GitHub repository for ways to engage!

#2024, #yearly-planning

Recap: Openverse in 2023

As 2023 comes to a close, I wanted to take some time to reflect on all that was accomplished in the last year! The list below isn’t exhaustive by any means, but it’s still lovely to look back at how Openverse and the community have changed in the last cycle around the sun.

Projects

Switched the repositories to a monorepo

We moved from the repositories openverse-api, openverse-frontend, and openverse-catalog to just openverse.

Moved to openverse.org

This came along with several other improvements to the frontend:

Provider DAG stability

Early in the year, maintainers and community members worked to stabilize our existing provider DAGs and ensure they could all run consistently and without failures. This also included new ways of reporting and addressing failures internally.

Core user interface improvements

We set aside time to align the frontend components with the intended designs for each component.

API ECS Migration

We changed the way the Openverse API is deployed in order to make it easier for maintainers to deploy new versions. What was previously a 1-hour (or longer) deployment process involving two maintainers is now an automatic process that occurs at the click of a button!

Along with this change, we had a few other significant infrastructure improvements throughout the year:

Usage Analytics

In order to gain a better understanding of how folks are using Openverse, we implemented event analytics in the frontend. This included the addition of numerous user events, an effort which was assisted by several new contributors!

Popularity calculation optimizations

We modified how popularity for certain results is calculated behind the scenes, in an effort to simplify the process and reduce the time it takes for results that get added to the catalog to become available in searches. This trimmed hours off a weekly data refresh process and reduced the number of errors that we saw with the refresh.

Filter and blur sensitive results by term matching

We posted about this new feature earlier this year:

API response time assessment and reduction

In the final third of 2023, Openverse experienced increased response times for searches across the board. The cause of this was ultimately the increase in both users and data available within our catalog – both great advancements! After several months of hard work, the maintainers were able to pin down a primary source for the slowness and significantly reduce how long it takes to run a search.

Community

In addition to the technical accomplishments from this last year, Openverse also had an impact on (and was impacted by!) the community.

New committers

Formalized the project planning process

The maintainers spent time in the first half of the year defining a process for how projects themselves would be planned, with the intent to make that planning transparent and available to the community. The motivation behind the decision-making process can be seen on our documentation site, as well as the specifics around how projects are planned. The documentation for each of the projects is also available on the site.

Collaboration with providers

Several of the Openverse maintainers met with one of our providers, the Statens Museum for Kunst.

Open Education Global – Infrastructure Award

Openverse was honored to receive the award for Open Infrastructure from OEG earlier this year.

By the numbers

Consider this the “Openverse Wrapped”! Here are some numbers for the end of 2023:

See you in 2024!

“Lighttrail Rocket” by SpaceX / CC0 1.0

#2023, #yearly-planning

Openverse Monthly Priorities Meeting 2024-01-03

Openverse contributors will host a community meeting to discuss priorities for January at 1500 UTC on January 3rd, 2024.

A sync video chat link will be provided in the #openverse channel of the Making WordPress Chat. We hope to see you there!

You can read the ongoing notes document for these meetings here.

#openverse-priorities, #priorities