Mitigating out-of-terms API usage

Yesterday at 20:20 UTC, we released version 2.5.5 of our API! Along with a few dependency upgrades and DevEx improvements/fixes, this release also brings an important change regarding anonymous API requests. As of v2.5.5, media searches made without an API key cannot request more than 20 results per page.

We made this change to mitigate behavior we were seeing on the API that was adversely affecting performance for other users, our capacity to update the data that backs Openverse, and our ability to deploy new changes to the API.

Our API Terms of Service state:

– A user must adhere to all rate limits, registration requirements, and comply with all requirements in the Openverse API documentation;

– A user must not scrape the content in the Openverse Catalog;

– A user must not use multiple machines to circumvent rate limits or otherwise take measures to bypass our technical or security measures;

– A user must not operate in a way that negatively affects other users of the API or impedes the WordPress Foundation’s ability to provide its services;

Background

Beginning around May 18th, we saw a significant increase in traffic.

Total requests made to api.openverse.engineering over the last 30 days

While the digital demographics (browser, user agent, OS, device type, etc.) were quite varied, one feature stuck out – these requests were all being made with the page_size=500 parameter.

Total requests made to api.openverse.engineering over the last 30 days using the page_size=500 parameter

Over the course of the last 30 days, these requests constituted almost 80% of our total traffic! While our application is designed to handle this many requests, it is not designed to handle each request querying for 500 results per page (the default page size is 20). This created significant strain on our Elasticsearch cluster and eventually disrupted the API’s ability to serve results. The image below combines a few of our monitoring tools to show a general correlation between the page_size=500 requests and our Elasticsearch resource utilization.

Request count compared to Elasticsearch resource utilization

Even before this release, our application was set up to throttle individual, anonymous users to 1 request/second. These page_size=500 requests were coming from a myriad of different hosts; the initiator was able to circumvent the individual throttles by employing a large number of machines (also known as a botnet). These machines were also predominantly tied to a single data center and a single ASN, which led us to believe this was orchestrated by a single user.

This behavior was clearly in violation of our Terms of Service, since it was:

  1. Not using a registered API key for high-volume use
  2. Scraping data from Openverse
  3. Using multiple machines to circumvent the application throttles
  4. Consuming significant enough resources that it impacted other users of Openverse
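To illustrate why a per-client throttle alone could not stop this traffic, here is a minimal Python sketch. It is not the Openverse implementation: the `PerClientThrottle` class and the `host-…` client identifiers are hypothetical, and only the 1 request/second limit comes from the description above.

```python
from typing import Dict


class PerClientThrottle:
    """Hypothetical throttle: at most one request per client per interval."""

    def __init__(self, min_interval: float = 1.0) -> None:
        self.min_interval = min_interval          # seconds between allowed requests
        self._last_allowed: Dict[str, float] = {}  # client id -> time of last allowed request

    def allow(self, client_id: str, now: float) -> bool:
        """Return True if this client may make a request at time `now`."""
        last = self._last_allowed.get(client_id)
        if last is not None and now - last < self.min_interval:
            return False  # same client, too soon: throttled
        self._last_allowed[client_id] = now
        return True


if __name__ == "__main__":
    # One host sending 500 requests within the same second: only the first passes.
    single = PerClientThrottle()
    print(sum(single.allow("host-a", now=0.0) for _ in range(500)))  # 1

    # 500 hosts sending one request each in that second: every request passes.
    many = PerClientThrottle()
    print(sum(many.allow(f"host-{i}", now=0.0) for i in range(500)))  # 500
```

Because the limit is tracked per client, spreading requests across many machines keeps every individual client under the limit while the aggregate load grows with the number of machines.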

Mitigation

As mentioned above, we deployed a change that now returns a 401 Unauthorized for any anonymous request to the API that includes a page_size greater than the default of 20. Almost immediately after deployment, we saw this mitigation take effect when observing request behavior:

Screenshot of a Cloudflare analytics page. The graph in the center shows total requests with page_size=500, separated by status code over 6 hours. A consistent number of requests (split between 301 and 200) can be seen starting at 9:00 PST. At 13:00 PST, the number of 401 requests begins to overtake the number of 200 requests. After 13:15, the number of 200 requests drops to zero and all requests returned are 401s.
Total number of page_size=500 requests made over the course of 6 hours, separated by return status code

In the above graph, you can see where we deployed v2.5.5 (~13:00 PST) – the number of 200 OK responses decreased, and the number of 401 Unauthorized responses increased significantly! Eventually all of the page_size=500 requests were being rejected as unauthorized.
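The rule itself is small. The following Python sketch shows the decision as described in this post; it is an illustration under assumptions, not the code shipped in v2.5.5. `SearchRequest` and `status_for` are hypothetical names, and a request is treated as anonymous simply when it carries no API key.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PAGE_SIZE = 20  # default page size stated in this post


@dataclass
class SearchRequest:
    """Hypothetical stand-in for an incoming media search request."""

    page_size: Optional[int] = None  # None means the parameter was omitted
    api_key: Optional[str] = None    # None means the request is anonymous


def status_for(request: SearchRequest) -> int:
    """Return the HTTP status the request would receive under the new rule."""
    requested = DEFAULT_PAGE_SIZE if request.page_size is None else request.page_size
    is_anonymous = request.api_key is None
    if is_anonymous and requested > DEFAULT_PAGE_SIZE:
        return 401  # anonymous requests may not exceed the default page size
    return 200


if __name__ == "__main__":
    print(status_for(SearchRequest(page_size=500)))                 # 401
    print(status_for(SearchRequest(page_size=20)))                  # 200
    print(status_for(SearchRequest(page_size=500, api_key="key")))  # 200
```

The sketch only covers the anonymous case described here; any upper bound on page size for requests made with an API key is outside what this post states.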

With this change, we were able to successfully mitigate the botnet and return our resource consumption to typical levels. This can be seen easily with a few Elasticsearch metrics:

Elasticsearch metrics over the last 12 hours

While the intention behind Openverse is to make openly licensed media easy to access, we don’t currently have the capacity to enable users to access the entire dataset at once. We do plan on exploring options for this in the future.

We’re pleased that this mitigation was successful, and we will continue to be vigilant in ensuring uninterrupted access to Openverse for our users!

#openverse, #infrastructure, #api

A week in Openverse: 2022-03-21 – 2022-03-28

openverse

Merged PRs

  • #202: Update feature_request.md label template to remove priority and aspect
  • #198: Update bug_report.md to remove default priority label
  • #197: Update bug_report.md to remove `Expectation` section
  • #194: Add infrastructure repo to synced repo list

Closed issues

  • #157: Create Openverse GitHub activity overview dashboard
  • #140: Remove “Expectation” header from bug report template
  • #73: [Feature] Configure ESLint and Prettier for JS scripts
  • #31: [Meta] 3D Models
  • #30: Remove requests for reviews from closed PRs

openverse-catalog

Merged PRs

  • #441: 🔄 Synced file(s) with WordPress/openverse
  • #440: 🔄 Synced file(s) with WordPress/openverse
  • #424: Add LRU cache to `is_valid_license_info`
  • #423: Change PhyloPic date range & schedule interval
  • #422: Round duration for provider ingestion completion message
  • #421: Enable XCom pickling in Airflow
  • #397: Add data refresh to Airflow

Closed issues

  • #419: Add an `lru_cache` to `is_valid_license_info`
  • #410: Change Phylopic to @weekly
  • #377: Enable XCom pickling
  • #373: Format “Airflow DAG Load Data Complete” duration
  • #353: Data refresh orchestration DAG

openverse-api

Merged PRs

  • #591: 🔄 Synced file(s) with WordPress/openverse
  • #590: 🔄 Synced file(s) with WordPress/openverse
  • #586: 🔄 Synced file(s) with WordPress/openverse
  • #584: Replace plural `categories` as field name with singular `category`
  • #583: Replace plural `categories` as field name with singular `category`
  • #580: Add CI check for uncommitted migrations
  • #577: Remove `query_serializer` for reporting endpoints
  • #576: Use `https` for hyperlinked APIs by replacing the URLs

Closed issues

  • #573: Return secure URLs for the fields thumbnail, detail_url and related_url.
  • #571: Run `makemigrations` in CI to prevent merging PRs with missing migrations.

openverse-frontend

Merged PRs

  • #1187: 🔄 Synced file(s) with WordPress/openverse
  • #1186: Mock services using jest.mock
  • #1183: 🔄 Synced file(s) with WordPress/openverse
  • #1182: Fix missing nuxt types
  • #1178: Add useFetchState composable
  • #1175: Add the 3D model SVG
  • #1173: Remove redundant type and simplify media service
  • #1172: Content page component design fixes
  • #1168: Update audio categories
  • #1166: Remove source links from sources page
  • #1163: Add support for TypeScript in Vue SFCs.
  • #1153: Add local visual regression infrastructure
  • #1150: Typescriptify `api-service`
  • #1148: Hotfix for negative values in peaks
  • #1147: Strictly filter sentry errors
  • #1144: Enable HTTPS in local development
  • #1142: fix focus outline placement button
  • #1140: Fix hero search button layout error
  • #1139: Add import extension linting rule
  • #1134: Convert license utils and constants to TS
  • #1131: Use links instead of buttons for header search type switcher
  • #1040: Convert the search store to Pinia

Closed issues

  • #1169: Audio category filter not working correctly
  • #1149: Add types to `data/api-service`
  • #1145: Add sentry ignore filters
  • #1143: Enable https in local development
  • #1138: Enable `import/extensions` rule for ESLint
  • #1136: Layout error in the hero search button in some locales
  • #1130: Search type switcher items in the header should use a link instead of a button
  • #1128: Europeana and SoundCloud don’t support search filters
  • #1122: Add 3D model icon svg to the project
  • #1121: Reduce to a single source of truth for search filters
  • #1110: Fix play/pause button focus outline placement
  • #1090: Create `VContentPage` component
  • #1037: Convert `search` store from Vuex to Pinia
  • #1019: Configure CI to run visual regression tests
  • #1017: Configure local visual regression testing
  • #1008: Providers links from Source page not working properly
  • #931: Include `utils/license.js` in `tsconfig.json`

#openverse, #week-in-openverse

A week in Openverse: 2022-03-14 – 2022-03-21

openverse

Merged PRs

  • #171: RFC: 3D Model Support

openverse-catalog

Merged PRs

  • #418: Fix invalid license urls from Finnish Museum API
  • #417: Use published Docker image in primary docker-compose.yml
  • #416: Fix schedule intervals on Cleveland Museum & Wikimedia Commons
  • #415: Reduce noise in NYPL ingestion
  • #414: Update API requests for Museum Victoria DAG
  • #413: Add ConnectionError to acceptable flaky exceptions for Freesound
  • #412: Add OFEO-SG subprovider
  • #409: Group test runs by module or class
  • #404: 🔄 Synced file(s) with WordPress/openverse
  • #402: Make ‘sound’ category more specific
  • #395: Handle duplicate keys in load_data task

Closed issues

  • #408: Group tests by test class in pytest to prevent test collisions
  • #406: Smithsonian workflow is missing configuration for sub-providers
  • #401: NYPL provider script is noisy regarding missing primary creators
  • #392: Finnish Museum `pull_data` freezes and times out
  • #391: PhyloPic DAG detects no content even when data exists
  • #390: Museum Victoria DAG fails to pull data
  • #389: Freesound pull_data task fails when getting audio file size
  • #388: Handle duplicate keys in the TSV load_data task
  • #379: Change Wikimedia Commons schedule interval to @daily
  • #378: Use published Docker image in primary docker-compose.yml
  • #368: Rename the “ingestion server” to “data refresh”

openverse-api

Merged PRs

  • #570: Add missing migrations
  • #568: Add throttle exemptions
  • #566: 🔄 Synced file(s) with WordPress/openverse
  • #556: Add pronunciation as valid sound category
  • #554: Add parameter to exclude certain sources

Closed issues

  • #565: Create an unrestricted rate limit model
  • #553: Query param to exclude a source
  • #526: Sound category mismatch
  • #391: Monitoring all the things

openverse-frontend

Merged PRs

  • #1137: Remove lodash.findindex from dependencies
  • #1129: Fix audio track null duration and add defaultRef
  • #1120: Update tailwindcss-rtl, talkback and typescript
  • #1115: 🔄 Synced file(s) with WordPress/openverse
  • #1112: Tweaks to the Image Details page
  • #1098: Fix mature content report submission
  • #1072: Refactor media store results getters
  • #1058: Convert more utils to TypeScript
  • #1057: Run e2e tests inside a docker container

Closed issues

  • #1111: Wrong font size on image details page and has horizontal scrolling on mobile
  • #1106: Replace `lodash.isempty` with domain-specific implementation
  • #1105: Replace `lodash.findindex` with `Array.prototype.findIndex`
  • #1079: Mature content report submission is broken
  • #1076: Audio track current time sometimes being set to non-real number
  • #1056: Faulty logic for audio count on the all results view
  • #1030: Audit tree-shaking and dead-code removal when using environment flags from `node_env.ts`
  • #929: Add types to `utils/get-parameter-by-name.js`
  • #920: Add types to `utils/attribution-html.js`
  • #895: Homepage search button text doesn’t fit in some locales
  • #756: Switch to Pinia

openverse-browser-extension

Merged PRs

  • #32: 🔄 Synced file(s) with WordPress/openverse

#openverse, #week-in-openverse

Community Meeting Recap (Mar 15th)

Takeaways

Done

  • Changes to the sound category [ref]
  • A11y fix for a double ring on focus around icon buttons by a new contributor 🎉 [ref]
  • Pinia migration PRs [ref]
  • TypeScript migration PRs [ref, ref]
  • Talkback proxy changes [ref]
  • Removal of the usage analytics code [ref]
  • Caching the audio waveforms in the API [ref]
  • Update of Django to version 4 in the API [ref]

In Progress

  • End-to-end test dockerization PR needs a second review [ref]
  • Add parameter to exclude certain sources in the API [ref]
  • Fix mature content report submission [ref]
  • Tweaks to the Image Details page [ref]
  • Convert more utils to TypeScript [ref]

Upcoming

  • Unrestricted rate limit model [ref]
  • Add 3D models as an “additional source” (meta search only view) in the content switcher [ref]
  • Addition of model_3d meta sources [ref]
  • Addition of video as a meta source [ref]

#openverse, #openverse-weekly-community-meeting

A week in Openverse: 2022-03-07 – 2022-03-14

openverse

Merged PRs

  • #191: Label PRs by contributors
  • #189: Update Pipenv files for Python 3.10
  • #188: Update Python workflows to run on Python 3.10
  • #185: Prevent PR labeller from overwriting labels on labelled PRs
  • #184: Lint RFCs and ensure they are lint-checked in the future
  • #180: Create a workflow to label a PR based on its linked issues

Closed issues

  • #183: Update Python workflows to run on Python 3.10
  • #175: PRs created with labels via GH CLI fail label check
  • #75: [Feature] Automatically label PRs with the aspect and goal labels

openverse-catalog

Merged PRs

  • #403: 🔄 Synced file(s) with WordPress/openverse

Closed issues

  • #384: Update API key for NYPL DAG
  • #383: Update API key for Smithsonian DAG

openverse-api

Merged PRs

  • #563: Send `[]` if media has no tags
  • #555: 🔄 Synced file(s) with WordPress/openverse
  • #548: Bump django from 3.2.12 to 4.0.3 in /api
  • #536: Bump pytest from 6.2.5 to 7.0.1 in /analytics
  • #530: Django command for generating waveforms

Closed issues

  • #549: Upgrade to Django 4
  • #529: Audio waveform cache-warming Django command
  • #143: [Bug] Integration tests of Ingestion server are failing

openverse-frontend

Merged PRs

  • #1113: Removed unused css class transition-colors
  • #1109: Remove unused deps
  • #1100: Handle waveform with `peaks` prop as a blank array
  • #1099: Add component imports, remove extra blank lines between imports
  • #1097: Remove Vocabulary icon font
  • #1095: Move typography defaults into `tailwind.css` file
  • #1092: Switch to css grid instead of legacy column classes in media reuse
  • #1084: Remove analytics code
  • #1082: Update a11y plugin and remove outdated package
  • #1080: 🔄 Synced file(s) with WordPress/openverse
  • #1070: Fix prop name mismatch
  • #1069: removing all the ‘ focus: ‘ classes from VIconButton.vue
  • #1068: Add Centre for Ageing to image meta search
  • #1065: Give clearer feedback for how to fix outdated POT file in CI
  • #1063: Removed iframe-height.js and its implementations
  • #1061: Convert Pinia stores to TypeScipt
  • #1045: Use a non-versioned API URL to mock analytics requests, too
  • #1039: Extract filter store from the search store and convert it to Pinia

Closed issues

  • #1094: Add font defaults to tailwind css file
  • #1091: Media reuse uses legacy columns classes
  • #1074: Empty waveform peaks data renders an empty waveform; should render placeholder instead.
  • #1067: Meta search provider: Centre for Ageing Better
  • #1059: Remove dead iframe height code
  • #1054: Replace individual `lodash.*` packages with `lodash` or remove entirely
  • #1026: Convert `usage-data` store from Vuex to Pinia
  • #1025: Convert `user` store from Vuex to Pinia
  • #1009: Empty audio search page shows audio track skeletons indefinitely
  • #1005: Audio play buttons have double focus rings
  • #1003: TypeError: Cannot read properties of undefined (reading ‘name’)
  • #919: Search from error page only show images
  • #915: Mock e2e testing analytics network requests
  • #842: Only update POT file timestamp if translations have changed
  • #834: Fixed footer when loading more images
  • #799: Image results sometimes `undefined`

openverse-browser-extension

Merged PRs

  • #31: Update README.md for consistiency with other repos
  • #29: 🔄 Synced file(s) with WordPress/openverse

Closed issues

  • #30: Update README to match other repositories

#openverse, #week-in-openverse

A week in Openverse: 2022-02-28 – 2022-03-07

openverse

Merged PRs

  • #174: RFC: Migration from Vuex to Pinia in the front-end
  • #165: RFC: Visual regression testing

openverse-catalog

Merged PRs

Closed issues

  • #381: Report the environment in TSV Slack messages

openverse-api

Merged PRs

  • #547: Bump boto3 from 1.21.0 to 1.21.10 in /ingestion_server
  • #544: Bump sentry-sdk from 1.5.5 to 1.5.6 in /api
  • #543: Bump furo from 2022.2.14.1 to 2022.2.23 in /api
  • #542: Bump locust from 2.8.2 to 2.8.3 in /api
  • #541: Bump ipython from 8.0.1 to 8.1.0 in /api
  • #539: Bump spectree from 0.7.3 to 0.7.6 in /analytics
  • #538: Bump alembic from 1.7.5 to 1.7.6 in /analytics
  • #537: Bump filelock from 3.5.1 to 3.6.0 in /ingestion_server
  • #535: Bump ipython from 8.0.1 to 8.1.0 in /ingestion_server
  • #534: Bump python-decouple from 3.5 to 3.6 in /analytics
  • #533: Bump tldextract from 3.1.2 to 3.2.0 in /ingestion_server
  • #524: Send peak data in search results and details

openverse-frontend

Merged PRs

  • #1052: Convert more utils to TypeScript
  • #1049: Add group class to audio track
  • #1046: handling null and undefined value for userAgent
  • #1044: Minor improvements to `.eslintrc.js` and `package.json`
  • #1041: Add the missing tape that causes e2e errors
  • #1038: Extract filter definition to constants directory
  • #1036: changed the scroll to top button color B->P
  • #1035: Add width and height properties to images
  • #1023: Convert 6 utils to TypeScript
  • #1013: Add missing labels to VPopover
  • #1011: Use props instead of store for searchTerm (query.q)
  • #1006: Image cell focus state improvements
  • #999: Add eslint rules for imports and eslint comments
  • #906: Create a proof-of-concept for Pinia migration
  • #881: Use talkback proxy to mock e2e API requests
  • #738: Truncate text in content switcher button, allow width to change
  • #603: Create the updated ‘No results’ and ‘Server timeout’ sections

Closed issues

  • #1048: Boxed audio doesn’t show license icons on focus
  • #1012: Some VPopovers are missing labels
  • #1010: Single type search pages (`search/`) should not use the store
  • #1004: Image results lack a focus state
  • #939: Add types to `utils/srand.js`
  • #938: Add types to `utils/sentry-config.js`
  • #937: Add types to `utils/send-message.js`
  • #936: Add types to `utils/resampling.js`
  • #935: Add types to `utils/prng.js`
  • #933: Add types to `utils/math.js`
  • #930: Add types to `utils/case.js`
  • #927: Add types to `utils/format-strings.js`
  • #926: Add types to `utils/env.js`
  • #925: Add types to `utils/string-to-boolean.js`
  • #924: Add types to `utils/dev.js`
  • #922: Add types to `utils/decode-data.js`
  • #903: Non-string UserAgent can crash the app
  • #901: Add eslint-plugin-eslint-comments to prevent accidentally leaving disabled rules for entire files
  • #857: Update attribution HTML generation to point to Openverse license/mark glyphs
  • #855: Add eslint-plugin-import to enforce import order and extension consistency
  • #831: Set up Pinia
  • #830: Rename getters that have the same name as state properties in Vuex stores
  • #602: No results page
  • #499: SSR request mocking on E2E tests

#openverse, #week-in-openverse

Community Meeting Recap (Mar 1st)

Announcements

  • Next week’s meeting will be hosted by @zackkrida, as we continue our hosting rotation amongst the sponsored Openverse developers.

Takeaways

Done

  • The provider DAGs have been reactivated and audited [ref]
  • Add peaks to the AudioDetail interface, new contributor 🎉 [ref]
  • We added logging levels to the Slack utility in the API [ref]
  • Reconfigured retries and timeouts for DAGs [ref]
  • Support for locale based locale paths in WordPress themes, a big step for frontend i18n [ref]
  • Completed the v3.1.0 frontend milestone with post-launch bug fixes [ref]

In progress

  • Sending peak data in the API needs final review [ref]
  • Adding eslint rules, needs review [ref]
  • Adding linting for JSDoc [ref]
  • Discussed SEO issues related to the iframe-approach. Issues that are fixable within the iframe have been resolved. We are optimistically waiting on feedback about moving away from the iframe before investing time in the remaining work. [ref]
  • Help needed in debugging Postgres connection crashes in the production API [ref]
  • Feedback needed on the 3D Model RFC [ref]

Upcoming

  • Tracking issue for issues coming out of the provider DAG audit [ref]
  • Milestone created for TypeScript RFC [ref]
  • Milestone created for Visual Regression Testing [ref]
  • Milestone created for Pinia migration [ref]
  • Milestone created for UI state cookie [ref]
  • Dependabot PRs to be tackled this week [ref]
  • RFC for feature flags in the frontend [ref]

#openverse, #openverse-weekly-community-meeting

A week in Openverse: 2022-02-21 – 2022-02-28

openverse

Merged PRs

  • #173: Add new tech labels for Bash and TypeScript
  • #172: Add minimum wait period to RFCs
  • #164: RFC: Introduce UI state cookie
  • #163: Add RFC README.md

Closed issues

  • #144: Support locale based locale paths in WordPress theme

openverse-catalog

Merged PRs

Closed issues

  • #374: Format duration in TSV load complete Slack message
  • #370: [RFC] Catalog & API 3D Model Support
  • #356: TSV loader completion slack message
  • #352: Reactivate provider DAGs
  • #349: Improve provider workflow retries
  • #348: Use `execution_timeout` rather than `dagrun_timeout`
  • #283: Audit Provider scripts and associated DAGs

openverse-api

Merged PRs

  • #532: Add logging levels to Slack notifications in ingestion server
  • #507: Run CI/CD on every pull request
  • #500: Bump django-oauth-toolkit from 1.5.0 to 1.7.0 in /api

Closed issues

  • #481: Add “limited reporting” mode for ingestion server
  • #443: Run integration tests on all PRs
  • #416: `test_dead_links_are_correctly_filtered` Flakiness
  • #389: `just` scripting for the Analytics server

openverse-frontend

Merged PRs

  • #998: Add optional peaks key to AudioDetail interface
  • #997: Make active media setup store and add unit tests
  • #995: Remove unguarded localStorage access
  • #993: Fix CSS import ordering
  • #990: Fix attribution HTML glyph reference and fix historical usages as well
  • #988: Remove z-index from brand blocking search type switcher
  • #982: Fetch single image result in `asyncData` hook instead of `fetch`
  • #981: Split CI into discrete jobs
  • #979: Add support for native TypeScript
  • #944: Remove dead code and fix errant type
  • #918: Enable SSR for migration banner
  • #917: Lint TS files in Git hooks
  • #916: Disallow all link components except “ with eslint rule
  • #904: Add migration notice and translation banner to the blank layout; fix translation banner logic
  • #898: Avoid translating brand names
  • #893: Rename Skeleton components
  • #892: Rename AudioDetails
  • #884: Use v-show instead of v-if for width-based condition
  • #880: Fix browser back button handling in search pages
  • #879: Make VLink component that wraps around both external and internal links
  • #867: Refactor media services
  • #851: Remove `mediaType` from `search.query` state
  • #850: Update license explanation tooltip

Closed issues

  • #996: Add peaks to the AudioDetail interface
  • #994: Rendering crashes in Chrome if localStorage is blocked
  • #992: Focused image result license icons are wrong colors
  • #983: Open external links in parent frame
  • #980: Navigate to the image detail page with an invalid id breaks it
  • #921: Add native TS support for non-Vue SFC files
  • #909: [RFC] Introduce UI state cookie to fix pop-in issues
  • #891: [RFC] Visual Regression Testing
  • #890: Split tests and static analysis into separate actions in the CI
  • #889: [RFC] Frontend 3D Model Support
  • #882: SSR Audio results page crashes
  • #875: Pages menu lacks focus styles
  • #866: Simplify media services
  • #860: Switching search type from an SSR’d audio results page to all content does not fetch all results
  • #856: Using the back button to navigate from the images search to the all content search results in only images showing
  • #835: Remove `query.mediaType` state property
  • #807: Strings with a value of “Openverse” should never be translated
  • #801: Unable to choose images on landing page, under content types on the search bar
  • #784: Blank layout doesn’t show the translation banner or CC referral banner
  • #759: Header size in tablet
  • #761: Horizontal scroll issue on Openverse main page, when viewing license information.
  • #663: Browser back button doesn’t resubmit previous search
  • #558: Fix Audio e2e tests
  • #541: Add license definition in filter sidebar
  • #515: CC Migration banner doesn’t SSR
  • #468: External links NOT opening in the Openverse iframe
  • #423: License explanation popup should close on click outside and be placed correctly

#openverse, #week-in-openverse

Community Meeting Recap (Feb 22nd)

Announcements

  • Next week’s meeting will be hosted by @stacimc, as we continue our hosting rotation amongst the sponsored Openverse developers.
  • Slack was having an ongoing outage during the course of this meeting, which caused some delays in communication.

Takeaways

Done

  • TSV loading is now performed at the end of provider API DAGs [ref]
  • Audio waveforms are now cached in the API database after being computed [ref]
  • Lots of movement on the frontend to remove dead code and general cleanup [ref]
  • A new VLink component which will help wrap internal and external links [ref]
  • Testing guidelines for the frontend! [ref]
  • Browser back button now behaves as expected [ref]

In progress

  • Numerous RFCs in need of review [ref]
  • A refactor of Media Services in the frontend [ref]
  • Changes to catalog DAG timeouts and retries [ref]
  • Improvements to E2E testing in the frontend [ref]
  • Slack completion message after a provider API DAG completes [ref]
  • Removing all external styles (some upcoming changes to this PR as well) [ref]
  • Decoupling waveform generation from the API by moving it out into a separate service [ref]
  • Native TypeScript support [ref]

Upcoming work

  • Catalog milestone v1.1.0 is very near completion [ref], and v1.2.0 will be underway soon [ref]
  • Improvements to automated accessibility testing via Vuetensils [ref]
  • A new RFC + milestone for application monitoring [ref]
  • Some discussion around moving openverse-frontend into the openverse repository as the first step towards a monorepo [ref]

✨ That’s all for now ✨

#openverse, #openverse-weekly-community-meeting

A week in Openverse: 2022-02-14 – 2022-02-21

openverse-catalog

Merged PRs

  • #362: Use Airflow Variables for storing API keys
  • #360: Add provider media type to DAG tags
  • #359: Differentiate between Slack channels
  • #357: Trigger TSV loading immediately after workflow

Closed issues

  • #346: Verify Freesound data
  • #323: Add DAG tags for media types
  • #285: Trigger TSV loader as a subDAG or DAG run
  • #209: Move API keys from .env to Airflow Variables

openverse-api

Merged PRs

  • #519: Remove pipdeptree
  • #518: Update Quickstart guide with troubleshooting tips
  • #515: Bump boto3 from 1.20.26 to 1.20.54 in /ingestion_server
  • #514: Bump spectree from 0.6.8 to 0.7.3 in /analytics
  • #513: Bump locust from 2.5.1 to 2.8.2 in /api
  • #511: Bump django from 3.2.9 to 3.2.12 in /api
  • #510: Cache waveform data in database
  • #509: Add quickstart and API documentation to README.md
  • #501: Bump ipython from 7.31.0 to 8.0.1 in /api
  • #495: Bump confluent-kafka from 1.7.0 to 1.8.2 in /analytics
  • #493: Bump pytest-order from 1.0.0 to 1.0.1 in /ingestion_server
  • #492: Bump sqlalchemy from 1.4.29 to 1.4.31 in /analytics
  • #459: Bump requests from 2.26.0 to 2.27.1 in /analytics

Closed issues

  • #517: Add troubleshooting tips to the Quickstart docs
  • #516: Remove `pipdeptree`
  • #490: Cache waveform data
  • #488: Reference quickstart guide in top level README.md

openverse-frontend

Merged PRs

  • #912: Re-add the viewport meta tag
  • #911: Use actual image source instead of foreign landing url as single result `og:image` path
  • #908: Remove outdated labelling instructions
  • #905: Remove unused Google Analytics code
  • #902: Remove old seo tags reintroduced via bad commit
  • #900: Disable friendly errors overlay
  • #897: Remove excessive padding from VContentItem
  • #894: Rename NoticeBar and MigrationNotice components
  • #886: Add TESTING_GUIDELINES.md initial draft
  • #883: Remove footer and associated styles and i18n strings
  • #872: Remove unused PhotoTags component
  • #870: Fix inline popover content taking up space
  • #869: Use single function to fetch single media item
  • #865: Add tags in image single result page
  • #864: Fix text colors
  • #863: Remove unused store modules
  • #861: Duplicate button classnames for increased specificity
  • #852: Fix search bar button crashing index page
  • #849: Fix global layout issues
  • #819: Fix homepage searchbar appearance when focused in mobile safari
  • #818: (#692) Fix duration mismatch between audio and metadata
  • #744: Audio single result refinements

Closed issues

  • #874: Back navigation after applying filters does not fetch new results
  • #862: Remove unused store modules
  • #859: Only show “load more results” when there are more results to load
  • #825: About page is broken
  • #817: Openverse logo link is broken on search pages
  • #802: Update meta tags to reflect wp.org
  • #787: Explore visual regression testing
  • #782: Unguarded `sessionStorage` access fails on certain privacy settings
  • #781: Single audio result view has horizontal scroll on mobile
  • #742: Adding tags in image single result page
  • #741: Audio download button should have rounded right corners when there’s no dropdown
  • #740: Single audio result view cleanup
  • #708: Wrong color in texts
  • #692: Audio waveform progress bar extends beyond the end of the waveform body
  • #671: Single audio view lacks ‘return to results’ link
  • #576: Search results title
  • #554: Add E2E tests to CI
  • #544: Audio result page responsiveness is messy
  • #495: Rename components to names with `V` prefix
  • #460: Decouple AudioController from `audio` in favour of headless `Audio`
  • #248: [Bug] Information list looks strange with long names (German)

#openverse, #week-in-openverse