Updating the database fails silently for SHA256 repos #152

Open
opened 2026-01-19 12:03:35 +00:00 by fkobi · 0 comments
Owner

The current database schema allows for 40 char (SHA1) hashes:

hash: str = db.Column(db.String(40))

When a repository that uses SHA256 (64 char hashes) is added, the software appears to work correctly:

root@cont2-noris:~# docker logs reuse-api 2>&1 | grep fkobi
[2026-01-19 12:00:26,432] DEBUG: Scheduling codeberg.org/fkobi/sd2
[2026-01-19 12:00:26,717] DEBUG: Repo outdated: codeberg.org/fkobi/sd2
[2026-01-19 12:00:26,717] INFO: Task enqueued: codeberg.org/fkobi/sd2
[2026-01-19 12:00:26,718] DEBUG: linting 'codeberg.org/fkobi/sd2'
[2026-01-19 12:00:29,517] DEBUG: finished linting 'codeberg.org/fkobi/sd2' return code is 0
[parameters: {'hash': '88e6d6a34e247937846853d6dd355f0549f35b150d33c4c01347cbdf76550310', 'status': <Status.OK: 'compliant'>, 'lint_code': 0, 'lint_output': '# SUMMARY\n\n* Bad licenses: 0\n* Deprecated licenses: 0\n* Licenses without file extension: 0\n* Missing licenses: 0\n* Unused licenses: 0\n* Used l ... (118 characters truncated) ... ion: 14 / 14\n* Files with license information: 14 / 14\n\nCongratulations! Your project is compliant with version 3.3 of the REUSE Specification :-)', 'spdx_output': 'SPDXVersion: SPDX-2.1\nDataLicense: CC0-1.0\nSPDXID: SPDXRef-DOCUMENT\nDocumentName: project\nDocumentNamespace: http://spdx.org/spdxdocs/spdx-v2.1-3 ... (5252 characters truncated) ... 46cf072af\nLicenseConcluded: NOASSERTION\nLicenseInfoInFile: EUPL-1.2\nFileCopyrightText: <text>SPDX-FileCopyrightText: 2026 Filip Kobierski</text>\n', 'last_access': datetime.datetime(2026, 1, 19, 12, 0, 29, 541275), 'repository_url': 'codeberg.org/fkobi/sd2'}]

In reality, however, the database update fails because the hash is too long for the field:

2026-01-19 12:00:29.545 UTC [45082] ERROR:  value too long for type character varying(40)
2026-01-19 12:00:29.545 UTC [45082] STATEMENT:  UPDATE repository SET hash='88e6d6a34e247937846853d6dd355f0549f35b150d33c4c01347cbdf76550310', [...]
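
One way to surface this failure earlier would be to check the hash length before the UPDATE is issued. The following is only a minimal sketch, not the project's code: `HASH_COLUMN_WIDTH`, `HashTooLongError` and `ensure_hash_fits` are hypothetical names, and the width of 40 is taken from the schema quoted above.

```python
# Hypothetical guard; the width mirrors db.String(40) from the current schema.
HASH_COLUMN_WIDTH = 40


class HashTooLongError(ValueError):
    """Raised when a commit hash would not fit into the database column."""


def ensure_hash_fits(commit_hash: str, width: int = HASH_COLUMN_WIDTH) -> str:
    """Return commit_hash unchanged, or raise if it exceeds the column width."""
    if len(commit_hash) > width:
        raise HashTooLongError(
            f"hash has {len(commit_hash)} characters, column allows {width}"
        )
    return commit_hash
```

Calling `ensure_hash_fits` with the 64 char hash from the log above raises `HashTooLongError` while the width is 40, and passes once the width is raised to 64.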

This is a high priority issue because:

  1. git will use SHA256 by default in 3.0, which is planned to be released by 2027, and in later versions
  2. git 2.52 has started introducing interoperability, so SHA256 commits will start appearing in SHA1 repos
  3. SHA256 is fully functional now and new repositories may choose to use it
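
Since SHA1 and SHA256 hashes differ in length (40 and 64 hexadecimal characters), the length of a full hash is enough to tell the two formats apart and to derive the column width needed to store both. A rough sketch under that assumption follows; `HEX_LENGTHS`, `REQUIRED_WIDTH` and `object_format` are illustrative names, not part of the existing code.

```python
import re

# Hex digest lengths of the two hash formats discussed in this issue.
HEX_LENGTHS = {"sha1": 40, "sha256": 64}

# Smallest column width that fits every format listed above.
REQUIRED_WIDTH = max(HEX_LENGTHS.values())

_HEX = re.compile(r"[0-9a-f]+")


def object_format(commit_hash: str) -> str:
    """Guess the hash format from the length of a full lowercase hex hash."""
    if not _HEX.fullmatch(commit_hash):
        raise ValueError("not a lowercase hexadecimal hash")
    for name, length in HEX_LENGTHS.items():
        if len(commit_hash) == length:
            return name
    raise ValueError(f"unexpected hash length: {len(commit_hash)}")
```

With these values `REQUIRED_WIDTH` is 64, so a field of that size would hold hashes from both kinds of repositories.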

Links:

  - Schema definition: https://git.fsfe.org/reuse/api/src/commit/ed08d1876cca9a8c135307ad6c0aa0345909c28d/reuse_api/models.py#L46
  - git 3.0 release plan: https://lore.kernel.org/all/aNxivuJEnSHbQNdr@fruit.crustytoothpaste.net/
  - git 2.52.0 release notes: https://github.com/git/git/blob/v2.52.0/Documentation/RelNotes/2.52.0.adoc

fkobi added the bug, component/database, prio/high, 1.0 labels 2026-01-19 12:03:35 +00:00
fkobi added the due date 2027-01-01 2026-01-19 12:04:32 +00:00