Tags for languages and countries are not separated #954

Closed
opened 2019-05-26 16:17:23 +00:00 by jzarl · 11 comments
Member

Looking over the tags it seems that there is no clear distinction between ISO 639-1 language codes and ISO 3166-1 country codes.
My understanding is that two-letter codes in tags are supposed to be country codes (language codes are embedded in the file name and subject to translation).

E.g. there are several Austrian events tagged as "de".

P.S.: As a side note, the tag for Great Britain/United Kingdom is "uk" instead of the correct "gb" (which is my fault).
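
To illustrate why a bare two-letter tag is ambiguous, here is a minimal sketch that classifies a code against two small hand-picked lists. The lists are illustrative subsets, not the full ISO 639-1 or ISO 3166-1 tables, and the function names `in_list` and `classify_code` are made up for this example; none of this is part of the website's tooling.

```shell
#!/bin/sh
# Illustrative subsets only, NOT the complete ISO lists.
LANG_CODES="de en fr it uk"      # ISO 639-1: German, English, French, Italian, Ukrainian
COUNTRY_CODES="at de fr gb it"   # ISO 3166-1 alpha-2: Austria, Germany, France, United Kingdom, Italy

# Succeed if $1 appears as a whole word in the space-separated list $2.
in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print how a two-letter code can be read, judged by the sample lists above.
classify_code() {
  if in_list "$1" "$LANG_CODES" && in_list "$1" "$COUNTRY_CODES"; then
    echo "ambiguous"
  elif in_list "$1" "$COUNTRY_CODES"; then
    echo "country"
  elif in_list "$1" "$LANG_CODES"; then
    echo "language"
  else
    echo "unknown"
  fi
}

for code in de at uk gb; do
  printf '%s: %s\n' "$code" "$(classify_code "$code")"
done
```

With these sample lists, `de` comes out as ambiguous, `at` and `gb` as country codes, and `uk` as a language code (Ukrainian), which is why an Austrian event tagged "de" cannot be told apart from a German one.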

Member

I agree that it makes more sense to tag by country than by language.

Owner

Full ack, and that's how we automatically do it for the events registered with the tool. Please have a look at the [tool's](https://fsfe.org/community/tools/eventregistration.html) country drop-down to see the country codes.

The wrongly tagged events probably stem from manual edits or earlier policies I am not aware of.
max.mehl added the tagging label 2019-05-27 08:47:35 +00:00
Owner

Partially fixed with @jzarl's #966

Member

Is it realistic to remove all language-related tags and, if needed, change them into country-specific tags? Because I guess we all agree that this is what we want, but who can do it?

Owner

I think it's realistic in the sense that we *must* do it some day. The tags are quite useless in their current state.

I wonder about the how though. In the past I did some smaller unification attempts, but they have been quite manual. Is there a way to create a file where we can define tags to be deleted or changed to something else, and then just run the whole thing? Not sure whether @jzarl's tool is exactly doing that already...

Author
Member

My script started with exactly that. Alas, I assumed that nobody wants to tediously define a .csv file in order to run a batch job and removed this mode when updating the tool to the new tag syntax.

IMO, though, we didn't really lose anything when I removed the bulk mode. You can still do something like the following:

```
# Remove all deprecated tags in one go. $deprecated_tags is unquoted,
# so a space-separated list is split into separate arguments.
tools/tagtool/tagtool.sh --remove-tags $deprecated_tags

# tags_to_rename.txt holds one "oldTag newTag" pair per line.
while read -r oldTag newTag
do
  tools/tagtool/tagtool.sh --rename-tag "$oldTag" "$newTag"
done < tags_to_rename.txt
```
Author
Member

The bigger problem to me seems to be /finding/ these tags, though.
Of course one can manually search for them, but that seems overly tedious.

Ideally, it would be nice to have a script to find candidate tags. Maybe we could identify suspicious tags by comparing the tags of each document to the other translations and issue a warning when there's a discrepancy regarding the tags?

In the past, this would probably have led to way too many warnings, but I guess the situation has improved, and especially with the new tag syntax it has become so much easier to write and translate tags correctly.

Maybe if such a script is written, we could even add it to the git commit hooks and warn authors/translators when they add/remove tags...
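
A rough sketch of what such a candidate finder could look like. It rests on assumptions that are not confirmed in this thread: that translations are named `<base>.<lang>.xml`, that the English file `<base>.en.xml` counts as the original, and that each tag carries its identifier in a `key="..."` attribute on a `<tag>` element. The function names `tag_keys`, `compare_tags` and `check_translations` are hypothetical.

```shell
#!/bin/sh
# Print the sorted, unique tag keys of one file, one per line.
# Assumes tags look like <tag key="at">Austria</tag> or <tag key="at"/>.
tag_keys() {
  grep -o '<tag[^>]*key="[^"]*"' "$1" \
    | sed 's/.*key="\([^"]*\)".*/\1/' \
    | sort -u
}

# Compare the tag keys of two files; warn on stderr and fail if they differ.
compare_tags() {
  if [ "$(tag_keys "$1")" = "$(tag_keys "$2")" ]; then
    return 0
  fi
  echo "WARNING: tags differ between $1 and $2" >&2
  return 1
}

# Compare every translation of one document with its English original.
# $1 is the path of the (assumed) original, e.g. events/foo.en.xml
check_translations() {
  original=$1
  base=${original%.en.xml}
  status=0
  for translation in "$base".*.xml; do
    [ "$translation" = "$original" ] && continue
    compare_tags "$original" "$translation" || status=1
  done
  return "$status"
}
```

Calling `check_translations` on each `*.en.xml` file would print one warning per mismatching translation. Only the keys are compared, not the translated labels, so a correctly translated tag would not trigger a warning.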

Member

> Ideally, it would be nice to have a script to find candidate tags. Maybe we could identify suspicious tags by comparing the tags of each document to the other translations and issue a warning when there's a discrepancy regarding the tags?

Actually I do think that this is an excellent idea.

> In the past, this would probably have led to way too many warnings, but I guess the situation has improved, and especially with the new tag syntax it has become so much easier to write and translate tags correctly.

I agree that for past news and events, this script would lead to many warnings, but actually I think that all of these warnings would be legitimate and should be fixed.

Let's face it: we have a mess, and we won't get rid of that mess without investing some work :-/

It's a lot of work, BUT: it doesn't need deep knowledge of the build or tagging system, so we might find some people to help with this. I'd be ready to cover my share of the cleanup.

> Maybe if such a script is written, we could even add it to the git commit hooks and warn authors/translators when they add/remove tags...

Another excellent idea.

Owner

> Maybe if such a script is written, we could even add it to the git commit hooks and warn authors/translators when they add/remove tags...

Actually, we already have a warning in the pre-commit hook if someone introduces a completely new tag: https://git.fsfe.org/FSFE/fsfe-website/src/branch/master/tools/githooks/pre-commit#L84-L109

For everything else, I agree with Reinhard here.

Member

I think the idea was that the pre-commit hook could warn if the tags in the translation differ from the tags in the original.
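
To make that idea concrete, here is a hedged sketch of the check such a hook might run. It is not the existing pre-commit hook. It assumes that a translation `<base>.<lang>.xml` has its original at `<base>.en.xml` and that tags carry a `key="..."` attribute; the function names are hypothetical. It only warns and never blocks the commit, and it looks at the working-tree files rather than the staged content.

```shell
#!/bin/sh
# Map a translation such as news/item.de.xml to its assumed original
# news/item.en.xml. Prints nothing for originals and unrelated files.
original_for() {
  case "$1" in
    *.en.xml) ;;                                    # already the original
    *.[a-z][a-z].xml) echo "${1%.??.xml}.en.xml" ;;
  esac
}

# Sorted, unique tag keys of one file (assumes <tag key="..."> syntax).
tag_keys() {
  grep -o '<tag[^>]*key="[^"]*"' "$1" \
    | sed 's/.*key="\([^"]*\)".*/\1/' \
    | sort -u
}

# Read file names on stdin and warn on stderr about every translation
# whose tag keys differ from those of its original.
warn_on_tag_mismatch() {
  while IFS= read -r file; do
    original=$(original_for "$file")
    [ -n "$original" ] && [ -f "$original" ] && [ -f "$file" ] || continue
    if [ "$(tag_keys "$file")" != "$(tag_keys "$original")" ]; then
      echo "WARNING: tags in $file differ from $original" >&2
    fi
  done
}

# Inside a hook, the staged files could be fed in like this:
# git diff --cached --name-only --diff-filter=ACM | warn_on_tag_mismatch
```

Reading the file names from stdin keeps the check independent of git, so it can also be run over a plain file list during a cleanup.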

Owner

It seems that we've fixed the wrong tags with the major cleanups in recent months.

I will work on the pre-commit hook and see how we can avoid a mismatch.

max.mehl self-assigned this 2021-01-19 19:48:12 +00:00
max.mehl closed this issue 2021-01-20 14:59:57 +00:00
Reference: FSFE/fsfe-website#954