Migrate from GitHub / AWS CodeCommit to Forgejo

This topic explains how to migrate your repositories from GitHub and AWS CodeCommit to the European alternative Forgejo. It is based upon the migration of approximately 100 Git repositories from GitHub and AWS CodeCommit to Forgejo.

Forgejo is very similar to GitHub for process management and identical to GitHub and AWS CodeCommit for Git version control (AWS CodeCommit has no serious process management features). Forgejo is open source and is made possible by Codeberg, a registered non-profit association based in Berlin, Germany. Forgejo originated as a hard fork of Gitea.

Version Control at Invantive

Invantive has been using version control to manage the software development and delivery process for a long time. It started out with RCS, later moved to CVS and then Subversion. Since approximately 2015, Git has been used.

There are approximately 100 software repositories, some very small and some huge in both number of files and total size.

During the migration around 2015, Git was not yet sufficiently mature. Various Git-based products were not able to handle the load, but things improved over time. Originally running on a dedicated GitLab server, Invantive later migrated to AWS CodeCommit due to the cost and reliability of a managed solution. AWS desupported AWS CodeCommit in June 2024, although existing users could continue to use it. Invantive then moved part of the Git load to GitHub, but it was found to be less mature than AWS CodeCommit in terms of reliability, so only part of the repositories were migrated.

With the arrival of a new administration in the US, Invantive has re-evaluated its positions and risks and decided to re-source IT components from European companies, or at least companies not under US control.

European Alternatives for GitHub and AWS CodeCommit

A number of alternatives have been studied, starting from a web page with European alternatives:

  • GitLab: rejected, given that the company behind it must conform to US laws.
  • Codeberg: rejected, since it is only intended for open source/public projects.
  • Gitea: rejected, given that the company behind it must conform to US laws.
  • Planio: studied, since despite its +1 phone number it is located in Berlin (DE) and is not under US jurisdiction. However, the UI is a little outdated and it is quite expensive for larger volumes of Git data (EUR 349/month for 80+ GB).
  • Codey: tested, but rejected due to currently missing SSH integration. Codey is a Swiss company with an excellent on-boarding experience that offers a SaaS version of Codeberg/Forgejo for a low price, but it currently only offers HTTPS for data exchange. Otherwise, the management options are all there. From experience it is known that Invantive's repositories can not be exchanged reliably over HTTPS, so it was put aside for now.
  • Stackit: tested, but rejected due to its young age. Based upon Codeberg/Forgejo, Stackit - part of the Schwarz Gruppe known for Lidl - offers free Git and process management. However, given its young age, English-only user interface and absence of management options, the option was rejected.
  • Forgejo: tested, and chosen. Forgejo can be self-hosted easily on a Linux server, and it met all requirements in terms of reliability and usability. Additionally, the migration from GitHub in terms of issue management was easy for the users. Costs are a few euros per month for a server, plus optional donations to the Codeberg organisation to help improve the product.

Warning! Both “forgejo” and “codey.ch” are names that are really hard to memorize. Take a break and practice the words in your mind!

Installation

Initially, a test server was created on which the migration, reliability and usability were tested. The installation and configuration of the production server took approximately 1 hour. It was based on Ubuntu 24.04 with 128 GB of hard disk storage, 2 ARM64 CPUs and 8 GB of memory. A smaller server might also fit the needs, as shown in the performance section.
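The installation itself followed the standard Forgejo binary installation for Linux. Purely as an illustration, a rough sketch of the kind of steps involved is shown below; the git user, the folder layout and the forgejo service name match the ones used elsewhere in this topic, while the binary file name and the exact commands are assumptions:

# Dedicated system user; the same git user is used later for forgejo dump.
adduser --system --group --disabled-password --home /var/lib/forgejo git
# Put the downloaded linux-arm64 binary from the Forgejo release page on the path.
install -m 755 forgejo-linux-arm64 /usr/local/bin/forgejo
# Configuration lives in /etc/forgejo/app.ini, data in /var/lib/forgejo.
mkdir -p /etc/forgejo /var/lib/forgejo
chown -R git:git /var/lib/forgejo
# Register and start the systemd service (unit file per the Forgejo documentation).
systemctl enable --now forgejo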

The hardest part was the configuration of SSL, since the March 2025 version of the documentation referred to CUSTOM_PATH where it should read CustomPath. When you read this, the PR to correct the documentation has probably already been accepted.

After creating the folder /var/lib/forgejo/custom/https, the Apache key file was stored there as key.pem, and concatenating the certificates with cat MOST_PRECISE.crt LESS_PRECISE.crt SOME_INTERMEDIATE_CA.pem >cert.pem created the needed cert.pem.
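A sketch of those commands; the certificate file names are the placeholders used above and the source location of the Apache private key is an assumption:

mkdir -p /var/lib/forgejo/custom/https
cd /var/lib/forgejo/custom/https
# Copy the Apache private key into place as key.pem (source path is an example).
cp /path/to/apache/server.key key.pem
# Build the certificate chain, most specific certificate first.
cat MOST_PRECISE.crt LESS_PRECISE.crt SOME_INTERMEDIATE_CA.pem > cert.pem
# Restrict access and hand ownership to the Forgejo user.
chmod 600 key.pem
chown -R git:git /var/lib/forgejo/custom/https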

Configuration /etc/forgejo/app.ini

The most notable changes to the configuration file were:

Section       Aspect      Value   Reason
server        MAX_FILES   100     Allow 100 files to be manually uploaded in one go.
git.timeout   DEFAULT     3600    Allow for more time given the size of some repositories.
git.timeout   MIGRATE     7200    Ditto.
git.timeout   MIRROR      3600    Ditto.
git.timeout   CLONE       3600    Ditto.
git.timeout   PULL        3600    Ditto.
git.timeout   GC          600     Ditto.

Furthermore, host names and the mailer were updated.
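As an illustration, the corresponding fragment of /etc/forgejo/app.ini then looks roughly as follows; verify the exact section and key names against the documentation of your Forgejo version:

[server]
; Allow 100 files to be manually uploaded in one go.
MAX_FILES = 100

[git.timeout]
; Larger timeouts (in seconds) to allow for the size of some repositories.
DEFAULT = 3600
MIGRATE = 7200
MIRROR  = 3600
CLONE   = 3600
PULL    = 3600
GC      = 600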

Migration Script

The following MS-DOS batch script can be used to export and import all files, releases and tags. The repository names to be migrated are listed one per line in projects.txt:

echo Migrate GitHub and/or AWS CodeCommit to Forgejo

rem Source and target remote prefixes; the repository name is appended.
set SRC=git@github.com:acme/
set TGT=git@forgejo.invantive.com:acme/

for /f %%G in (projects.txt) do (
  echo Migrate %%G: create target repository in Forgejo.
  rem Wait until the empty target repository has been created in Forgejo.
  pause
  rem Clone a bare copy of the source repository with all branches and tags.
  git clone --bare %SRC%%%G.git
  cd %%G.git
  rem Also fetch all Git LFS objects before pushing.
  git lfs fetch --all
  rem Push all refs and LFS objects to Forgejo.
  git push --mirror %TGT%%%G.git
  git lfs push --all %TGT%%%G.git
  cd ..
)

When completed, the exported files can be saved offline for use in case of emergencies.

Migration Performance

The import on Forgejo is quick: a repository of 1,000 MB is done within minutes. However, the import of tags is extremely slow: 1,000 tags can take between 10 and 30 minutes.

Altogether, the migration took 24 clock hours. One specific repository took over 8 hours in the git lfs push step.

This required the team to not exchange data with GitHub or AWS CodeCommit during the migration. After completion of the migration, they moved their old local repositories to a different location and pulled the repositories from Forgejo. After that, they manually copied any locally changed files into the Forgejo-based repositories.

Importing of repositories was done in parallel with 12 streams. On 2 ARM64 CPUs with 8 GB of memory, the CPU statistics were (sar):

CPU     %user     %nice   %system   %iowait    %steal     %idle
all     49.23      0.00     17.96      0.02      0.42     32.37
all     43.79      0.00     35.79      4.67      0.61     15.15
all     55.28      0.00     26.23      0.17      0.66     17.66
all     38.79      0.00     37.48      4.17      0.68     18.89
all     67.74      0.00     31.33      0.30      0.17      0.46
all     34.96      0.00     29.03      4.90      0.78     30.32
all     23.84      0.00     15.59      0.55      0.89     59.14
all     62.21      0.00     28.32      0.08      0.32      9.08
all     64.38      0.00     35.31      0.07      0.15      0.09
all     53.81      0.00     39.29      1.46      0.30      5.13
all     42.01      0.00     32.69      1.53      0.61     23.16
all     31.43      0.00     21.17      0.34      0.85     46.22
all     18.38      0.00     11.99      0.20      0.79     68.64
all     18.02      0.00     11.87      0.45      1.00     68.66
all     16.01      0.00     10.76      0.65      1.03     71.56
all     20.56      0.00     12.16      0.70      1.18     65.40
all     15.31      0.00      9.42      1.63      1.34     72.30
all     25.64      0.00     12.20      1.06      1.07     60.03
all     36.55      0.00     15.30      0.44      0.90     46.81
all     32.52      0.00     15.00      1.30      0.95     50.22

During most of this time, there was little to no delay due to memory exhaustion (sar -B):

pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
    0.00    436.94 147077.72      0.00 104741.90      0.00      0.00      0.00      0.00
    0.00   4227.42 123593.83      0.00  87216.09      0.00      0.00      0.00      0.00
    0.00    692.81 170904.34      0.00 122163.79      0.00      0.00      0.00      0.00
    0.00   3390.46 109143.95      0.01  77160.24      0.00      0.00      0.00      0.00
    0.01   2372.41 207534.26      0.00 147420.46      0.00      0.00      0.00      0.00
    0.00   3490.89  86366.82      0.00  60758.51      0.00      0.00      0.00      0.00
    0.00   1843.37  58694.50      0.00  43734.04      0.00      0.00      0.00      0.00
    0.22   1930.63 186726.37      0.00 133890.25      0.00      0.00      0.00      0.00
    0.00   2825.55 197841.44      0.01 140980.54      0.00      0.00      0.00      0.00
    0.00   3808.82 149030.41      0.00 107025.04      0.00      0.00      0.00      0.00
    1.53   3315.53 114504.97      0.01  83442.76      0.00      0.00      0.00      0.00
    1.01   2000.82  83942.62      0.01  62193.65      0.00      0.00      0.00      0.00
    0.00    831.04  45812.53      0.00  33858.56      0.00      0.00      0.00      0.00
   12.11   2300.03  37601.07      0.06  28952.54    189.21      0.00    378.31    199.95
    0.13   2314.12  33677.60      0.00  26330.47    551.98      0.00   1103.94    200.00
    1.89   2486.64  39475.95      0.00  31023.66    625.77      0.00   1251.53    200.00
  711.08   3963.57  21283.64      0.13  19429.61   1236.70      0.00   2473.26    199.99
  983.07   2608.89  39582.99      0.19  34949.26    751.17      0.00   1474.12    196.24
 1307.57   1724.65  52438.97      0.04  42501.57    808.61      0.00   1444.59    178.65
 3329.08   1924.01  54935.25      0.59  46709.06   1306.80      0.00   2450.97    187.55

Given that some very large repositories were being imported at the end of these measurements, it is assumed that the paging was caused by those multi-gigabyte repositories.
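For reference, the statistics above were collected with sysstat's sar; a sketch of comparable invocations, where the 300-second sampling interval is an assumption:

# CPU utilization, one sample every 300 seconds.
sar -u 300
# Paging statistics.
sar -B 300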

Backups

A script similar to the clone and fetch steps in the migration script can be used for backups when solely the Git contents need to be saved, as sketched below.
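A minimal sketch of such a backup script as a shell script, reusing the projects.txt convention and the Forgejo host name from the migration script; adjust names and paths to your environment:

#!/bin/sh
# Mirror every repository listed in projects.txt (one name per line) from Forgejo.
TGT=git@forgejo.invantive.com:acme/
while read -r REPO
do
  if [ -d "${REPO}.git" ]
  then
    # Already mirrored earlier: only fetch the latest refs and prune removed ones.
    git -C "${REPO}.git" fetch --all --prune
  else
    git clone --mirror "${TGT}${REPO}.git"
  fi
  # Also fetch the Git LFS objects, if LFS is used in the repository.
  ( cd "${REPO}.git" && git lfs fetch --all )
done < projects.txt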

The alternative is to use forgejo dump to generate a huge and uniquely named zip-file such as forgejo-dump-123456789.zip, as in:

# systemctl stop forgejo
# su - git
$ cd /MY_BACKUP_FOLDER
$ forgejo dump --config /etc/forgejo/app.ini
$ exit
# systemctl start forgejo

forgejo dump runs very fast, generating approximately 4 GB of zip data per minute. The resulting zip-file contains the repositories as well as the database, the configuration file and the data folder.

Functional Implementation

Once transitioned into production, the following steps were executed:

  • Create organizations as needed to group the repositories.
  • Create users and put them in teams.
  • Assign teams the necessary privileges, including which repositories are accessible.
  • Migrate the Wikis by hand from GitHub since there are some small differences.
  • Migrate and structure the open issues by hand. Issue history is dumped to a text file using the Invantive UniversalSQL GitHub-driver.
  • Migrate the large number of labels by hand.

At Invantive, the software build process is intentionally done outside of the version control software to ease migrations, so there was no CI/CD to migrate.