Store data in PostgreSQL instead of JSON file #25
Currently, each email sent is logged to a JSON file, including its complete content. Because a JSON file cannot simply be appended to, every email sent causes the whole logfile to be read into memory and parsed; the new record is then appended, and everything is converted back to JSON and written out again in full.
This becomes very inefficient once we reach thousands of log entries.
I suggest that we
- reevaluate whether we really want the whole content in the logfile
- consider using a different file format where new records can be added by simply appending to the file, like YAML or just simple CSV.
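To illustrate the second point, here is a minimal sketch of append-only logging with CSV. The field names and the `append_record` helper are made up for illustration; the real log schema is not shown in this issue.

```python
import csv
from pathlib import Path

# Hypothetical field names; the actual log schema is an assumption.
FIELDS = ["timestamp", "recipient", "subject"]


def append_record(path, record):
    """Append one record to a CSV log without reading the existing file."""
    path = Path(path)
    # Write the header only when the file is new or empty.
    write_header = not path.exists() or path.stat().st_size == 0
    # Mode "a" writes at the end of the file; newline="" is what the csv module expects.
    with path.open("a", newline="", encoding="utf-8") as fh:
        writer = csv.DictWriter(fh, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(record)
```

The cost of each call stays constant regardless of how many records the file already holds, which is the property the current JSON approach lacks.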
For the PMPC open letter, we definitely need a machine-readable format to parse the different inputs of a form. Of course, we could refrain from logging the whole email content (i.e. the text body) to the JSON file.
For publiccode.eu, I don't really care whether the actual file is YAML, JSON or CSV. I personally love JSON because it's less error-prone than YAML and has more features than CSV (like arrays).
Meanwhile, I've come to the conclusion that an SQL database (probably SQLite) would be best, considering that for every new registration all previous registrations are scanned for potential duplicates.
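A rough sketch of how the duplicate check could be left to the database instead of scanning all previous registrations. It assumes duplicates are identified by email address, which this issue does not state; the table layout and function names are hypothetical as well.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and create the (hypothetical) registrations table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS registrations ("
        " id INTEGER PRIMARY KEY,"
        " email TEXT NOT NULL UNIQUE,"  # assumption: email identifies a duplicate
        " data TEXT NOT NULL)"
    )
    return conn


def register(conn, email, data):
    """Insert a registration; return False if the email already exists."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO registrations (email, data) VALUES (?, ?)",
                (email, data),
            )
        return True
    except sqlite3.IntegrityError:
        # The UNIQUE constraint rejected the row, so it is a duplicate.
        return False
```

With the `UNIQUE` constraint, SQLite maintains an index on the column, so the check does not have to read every stored registration.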
Excellent! However, we make some use of the JSON files, e.g. for publiccode.eu and perhaps also the REUSE API. Would we have to rewrite these accordingly?
I would suggest that we log to both targets (JSON and SQLite) for some time, so that all systems consuming the logs can be migrated without haste. Would you consider SQLite a useful data source for systems like publiccode.eu?
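Logging to both targets during the transition could look roughly like this. It is only a sketch: the function names, the `log` table, and storing each record as a JSON string are assumptions, not the existing code.

```python
import json
import sqlite3
from pathlib import Path


def log_to_json(path, record):
    """Legacy target: read the whole file, append, write it back in full."""
    path = Path(path)
    records = json.loads(path.read_text(encoding="utf-8")) if path.exists() else []
    records.append(record)
    path.write_text(json.dumps(records, indent=2), encoding="utf-8")


def log_to_sqlite(conn, record):
    """New target: one INSERT per record (hypothetical table layout)."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS log ("
            " id INTEGER PRIMARY KEY, record TEXT NOT NULL)"
        )
        conn.execute("INSERT INTO log (record) VALUES (?)", (json.dumps(record),))


def log_record(json_path, conn, record):
    """Write to both targets until all consumers have migrated."""
    log_to_json(json_path, record)
    log_to_sqlite(conn, record)
```

Once no system reads the JSON file any more, the `log_to_json` call can be dropped without touching the callers of `log_record`.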
SQLite in general is a nice solution since it has low complexity, so a +1 on that.
Currently, the pmpc site's CMS, Hugo, can read the JSON file directly, and there is no such feature for SQLite (at least not natively). But if it was possible to convert SQLite to JSON in the container, I wouldn't see a big problem with it.
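Such a conversion could be a small script run in whichever container ends up doing it. A minimal sketch, assuming a table named `registrations` with an `id` column (both hypothetical):

```python
import json
import sqlite3


def export_registrations(db_path, json_path):
    """Dump all rows of the (hypothetical) registrations table to a JSON file."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # rows can then be converted to dicts
    try:
        rows = conn.execute("SELECT * FROM registrations ORDER BY id").fetchall()
    finally:
        conn.close()
    with open(json_path, "w", encoding="utf-8") as fh:
        json.dump([dict(row) for row in rows], fh, indent=2)
    return len(rows)
```

The resulting file is a JSON array of objects keyed by column name, which a static site generator could read like any other JSON data file.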
> if it was possible to convert SQLite to JSON in the container
Do you mean in the forms container or in the pmpc container?
I would really like to rename this to "Store data in PostgreSQL instead of JSON file", as we would then slowly but surely converge on our Flask, SQLAlchemy and PostgreSQL stack across multiple projects (REUSE API coming soon).