Commons:Dumps and backups
This page is intended to be the central place for public information about Wikimedia Commons data dumps and backups.
Dumps
Media files
As of 2024, no dumps of media files have been publicly available for download since about 2013. There is a request, and an open Phabricator ticket, to resume them.
Wikimedia Commons wiki content
All Wikimedia Commons wiki pages, including all past revisions in their history (excluding deleted ones), are included in XML dumps, which are generated regularly and are publicly available for download at https://dumps.wikimedia.org/commonswiki/.
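As a rough illustration of how such an XML dump can be read once downloaded, the following Python sketch streams through the pages and counts the revisions of each. It is only a sketch under stated assumptions: the function names (local_name, iter_pages, open_dump) and the embedded sample are hypothetical, the export namespace version in the sample may differ from the one in current dumps, and it assumes the dump file is bzip2-compressed.

```python
import bz2
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(stream):
    """Yield (title, revision_count) for each <page> in a MediaWiki XML export.

    The stream is parsed incrementally, so the whole file is never
    loaded into memory at once.
    """
    title, revisions = None, 0
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "revision":
            revisions += 1
            elem.clear()  # drop the revision's children and text to limit memory use
        elif name == "page":
            yield title, revisions
            title, revisions = None, 0
            elem.clear()


def open_dump(path):
    """Open a locally downloaded dump file (assumed to be bzip2-compressed)."""
    return bz2.open(path, "rb")


# Tiny hand-written sample in the shape of a MediaWiki export (hypothetical data).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
  <page>
    <title>Commons:Example</title>
    <revision><id>1</id><text>first</text></revision>
    <revision><id>2</id><text>second</text></revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    for page_title, count in iter_pages(io.BytesIO(SAMPLE)):
        print(page_title, count)
```

For a real dump, the same generator could be fed from a downloaded file, for example with iter_pages(open_dump("dump.xml.bz2")), where the path is a placeholder for whichever file was fetched from the dumps site.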
Backups
Media files
All media files in Wikimedia Commons, including their past versions, are backed up on dedicated servers in both Wikimedia Foundation application data centers: Eqiad (Ashburn, Virginia, USA) and Codfw (Carrollton, Texas, USA). These backups are not accessible to the public, including registered Wikimedia users; only Wikimedia Foundation staff can access them, because they include deleted files and other data that cannot be made public. The backups at each data center are fully independent of each other, for redundancy.
Wikimedia Commons wiki content
Wiki content of all Wikimedia wikis (including Commons), which is stored in MariaDB databases, is also backed up in both Wikimedia Foundation application data centers. Those backups also include the full version history of all pages.
History
Before 2014, when a second facility came online for redundancy, all Wikimedia sites operated from a single application data center. (There were, as there are now, additional data centers for caching and content distribution, but without any permanent data storage.) Although geographical redundancy and XML text dumps were in place, no true text-only database backups were implemented until 2020-2021, after several years of work. Backups for media files were not in place until 2021-2022. Offline backups (for example, on tape) were featured as "coming next" in a Wikimedia Foundation presentation (slide 48), but there is no explicit mention of them on Wikitech or Phabricator.
Sources
- Phabricator ticket: Produce regular public dumps of Commons media files
- Phabricator ticket: WMF media storage must be adequately backed up
- Phabricator ticket: Set up backup strategy for es clusters
- Data centers (Wikitech)
- Media storage/Backups (Wikitech)
- MariaDB/Backups (Wikitech)
- Wikimedia Foundation selects CyrusOne in Dallas as new data center