code.forgejo.org: database ping failed with error: context canceled #61

Closed
opened 2025-05-29 20:12:43 +00:00 by GhostPLACEHOLDER · 3 comments

Happened once and no notes were taken. Recording the incidents here for forensic purposes:

  • https://code.forgejo.org/infrastructure/k8s-cluster/issues/420

xorm

  • using the latest version as of v10.0.0 and above
  • no ping-related issue reports https://gitea.com/xorm/xorm/issues?state=open&type=all&labels=&milestone=0&project=0&assignee=&poster=&q=ping
  • no PR merged in the dev branch that is related to that particular problem

SQLite

  • https://code.forgejo.org/infrastructure/k8s-cluster/issues/420 happened with go-sqlite v1.14.24 / SQLite 3.46.1

v10 & v11 upgrades

As of 17 April 2025 both v11 & v12-dev run go-sqlite v1.14.28 / SQLite 3.49.1 https://codeberg.org/forgejo/forgejo/pulls/7563 - https://github.com/mattn/go-sqlite3/commit/f76bae4b0044cbba8fb2c72b8e4559e8fbcffd86.

Nothing in the changelog since SQLite 3.46.1 suggests a fix related to this problem https://sqlite.org/changes.html

Given that a SQLite connection ping is about opening a session to a file, with no network connection involved, it could only time out for two reasons:

  • a deadlock
  • an I/O problem

Although it was observed on a DRBD-mounted device, the I/O problem is ruled out because:

  • it was observed to last for hours
  • during the same time other Forgejo instances running off the same DRBD-mounted volume ran fine

Which leaves the deadlock. It seems to happen only when the instance slows down for some reason. Maybe slow SQL queries increase the likelihood of the deadlock, so that it sometimes manifests itself.
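To illustrate how a purely local SQLite session can time out without any network or disk fault, here is a minimal sketch using Python's standard `sqlite3` module. It is not Forgejo or xorm code: the `probe` and `demo` helpers, the 0.2 second timeout and the temporary database are all assumptions made for illustration, and the probe takes a write lock as a stand-in for a health check rather than reproducing what the real ping does.

```python
import os
import sqlite3
import tempfile


def probe(path, timeout_s):
    """Hypothetical health check: open a fresh connection and try to take a write lock."""
    # isolation_level=None puts the connection in autocommit mode so that
    # the explicit BEGIN/ROLLBACK statements below are sent as written.
    conn = sqlite3.connect(path, timeout=timeout_s, isolation_level=None)
    try:
        conn.execute("BEGIN IMMEDIATE")  # waits up to timeout_s for the lock
        conn.execute("ROLLBACK")
        return "ok"
    except sqlite3.OperationalError as exc:
        return str(exc)
    finally:
        conn.close()


def demo():
    """Return the probe result while the file is locked and after it is released."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    holder = sqlite3.connect(path, isolation_level=None)
    holder.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

    # Simulate a session that never lets go of the database file.
    holder.execute("BEGIN EXCLUSIVE")
    blocked = probe(path, 0.2)

    # Once the lock is released the same probe succeeds immediately.
    holder.execute("ROLLBACK")
    free = probe(path, 0.2)
    holder.close()
    return blocked, free


if __name__ == "__main__":
    print(demo())
```

While the first connection holds its exclusive transaction, the probe gives up after the busy timeout with a "database is locked" error; after the rollback it succeeds. A lock that is never released would keep the probe failing for as long as it is held, which would be consistent with a failure lasting hours, although nothing here shows that this is what happened in the incident.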

If that happens again it would be best documented & tracked publicly in Forgejo itself.

Follow-up: https://codeberg.org/forgejo/forgejo/issues/7741
Reference: infrastructure/k8s-cluster#61