Store data in PostgreSQL instead of JSON file #25

Open
opened 2019-02-12 15:01:16 +00:00 by reinhard · 8 comments
Owner

Currently, each email sent is logged into a JSON file, including the complete content. Since JSON is a file format to which new records cannot simply be appended, for each email sent, the whole logfile is read into memory, parsed, then the new record is appended, then it's converted back to JSON and the whole file is newly written.

Obviously, this is very inefficient when we reach thousands of log entries.

I suggest to

  1. reevaluate whether we really want the whole content in the logfile
  2. consider to use a different file format where new records can be added by simply appending to the file, like YAML or just simple CSV.
Currently, each email sent is logged into a JSON file, including the complete content. Since JSON is a file format to which new records cannot simply be appended, for each email sent, the whole logfile is read into memory, parsed, then the new record is appended, then it's converted back to JSON and the whole file is newly written. Obviously, this is very inefficient when we reach thousands of log entries. I suggest to 1. reevaluate whether we really want the whole content in the logfile 2. consider to use a different file format where new records can be added by simply appending to the file, like YAML or just simple CSV.
Owner

For the PMPC open letter, we definitely need a machine-readable format to parse the different inputs of a form. Of course, we could restrain from logging the whole email content (so the text body) to the JSON file.

For publiccode.eu, I don't really care whether the actual file is YAML, JSON or CSV. I personally love JSON because it's less error-prone than YAML and has more features than CSV (like arrays).

For the PMPC open letter, we definitely need a machine-readable format to parse the different inputs of a form. Of course, we could restrain from logging the whole email content (so the text body) to the JSON file. For publiccode.eu, I don't really care whether the actual file is YAML, JSON or CSV. I personally love JSON because it's less error-prone than YAML and has more features than CSV (like arrays).
Author
Owner

Meanwhile, I've come to the conclusion that an SQL database (probably SQLite) would be best, considering that for every existing new registration all previous registrations are scanned for potential duplicates.

Meanwhile, I've come to the conclusion that an SQL database (probably SQLite) would be best, considering that for every existing new registration all previous registrations are scanned for potential duplicates.
Owner

Excellent! However, we make some use of the JSON files, e.g. for publiccode.eu and perhaps also the REUSE API. Would we have to rewrite these accordingly?

Excellent! However, we make some use of the JSON files, e.g. for publiccode.eu and perhaps also the REUSE API. Would we have to rewrite these accordingly?
Author
Owner

I would suggest that we log to both targets (JSON and SQLite) for some time until all systems consuming the logs can be migrated with no haste. Would you consider SQLite an useful data source for systems like publiccode.eu?

I would suggest that we log to both targets (JSON and SQLite) for some time until all systems consuming the logs can be migrated with no haste. Would you consider SQLite an useful data source for systems like publiccode.eu?
Owner

SQLite in general is a nice solution since it has a low complexity, so a +1 on that.

Currently, the pmpc site'S CMS hugo can read the JSON file directly, and there is no such feature for SQLite (at least native). But if it was possible to convert SQLite to JSON in the container, I wouldn't see a big problem with it.

SQLite in general is a nice solution since it has a low complexity, so a +1 on that. Currently, the pmpc site'S CMS hugo can read the JSON file directly, and there is no such feature for SQLite (at least native). But if it was possible to convert SQLite to JSON in the container, I wouldn't see a big problem with it.
Author
Owner

if it was possible to convert SQLite to JSON in the container

Do you mean in the forms container or in the pmpc container?

> if it was possible to convert SQLite to JSON in the container Do you mean in the forms container or in the pmpc container?
Owner

The PMPC container. Sorry for being so unclear :/

The PMPC container. Sorry for being so unclear :/
reinhard self-assigned this 2019-08-02 12:47:07 +00:00
reinhard changed title from Logging into a JSON file is highly inefficient to Store data in SQLite database instead of JSON file 2019-08-02 13:52:36 +00:00
reinhard removed their assignment 2020-03-23 14:43:03 +00:00
Owner

I would really like to rename this to "Store data in PostgreSQL instead of JSON file" as we would then slowly but surely converge on our Flask - SQLAlchemy - PostgreSQL Stack across multiple projects (REUSE API coming soon).

I would really like to rename this to "Store data in PostgreSQL instead of JSON file" as we would then slowly but surely converge on our Flask - SQLAlchemy - PostgreSQL Stack across multiple projects (REUSE API coming soon).
linus changed title from Store data in SQLite database instead of JSON file to Store data in PostgreSQL instead of JSON file 2022-11-16 11:37:08 +00:00
linus added the
enhancement
prio:high
labels 2022-12-19 12:02:09 +00:00
linus added this to the 2023-1 milestone 2023-03-14 09:14:09 +00:00
florian.vuillemot self-assigned this 2023-07-10 20:17:03 +00:00
Sign in to join this conversation.
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: fsfe-system-hackers/forms#25
No description provided.