Tags for languages and countries are not separated #954

Closed
jzarl opened 2019-05-26 16:17:23 +00:00 · 11 comments
Member

Looking over the tags it seems that there is no clear distinction between ISO 639-1 language codes and ISO 3166-1 country codes.
My understanding is that two-letter codes in tags are supposed to be country codes (language codes are embedded in the file name and subject to translation).

E.g. there are several Austrian events tagged as "de".

P.S.: As a side note, the tag for Great Britain/United Kingdom is "uk" instead of the correct "gb" (which is my fault).
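To illustrate why the two code sets are easy to confuse, here is a minimal sketch that classifies a two-letter tag key. The code lists are small, hand-picked subsets of ISO 3166-1 alpha-2 and ISO 639-1, and the function name `classify_tag` is hypothetical; a real check would load the full tables.

```shell
#!/usr/bin/env bash
# Illustrative subsets only, not the complete standards.
country_codes="at de fr gb it nl ua"   # subset of ISO 3166-1 alpha-2
language_codes="de en fr it nl uk"     # subset of ISO 639-1

# Print "country", "language", "ambiguous" or "unknown" for a tag key.
classify_tag() {
  local tag=$1 is_country=no is_language=no
  case " $country_codes " in *" $tag "*) is_country=yes ;; esac
  case " $language_codes " in *" $tag "*) is_language=yes ;; esac
  case "$is_country/$is_language" in
    yes/yes) echo ambiguous ;;
    yes/no)  echo country ;;
    no/yes)  echo language ;;
    *)       echo unknown ;;
  esac
}
```

With these lists, `classify_tag uk` reports a language code (Ukrainian), while `classify_tag de` is ambiguous because "de" is both the code for Germany and the code for German. Ambiguous keys cannot be resolved by the code alone and need a look at the event itself.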

Member

I agree that it makes more sense to tag by country than by language.

Owner

Full ack, and that's how we automatically do it for the events registered with the tool. Please have a look at the country drop-down of the tool (https://fsfe.org/community/tools/eventregistration.html) to see the country codes.

The wrongly tagged events probably stem from manual edits or earlier policies I am not aware of.
max.mehl added the tagging label 2019-05-27 08:47:35 +00:00
Owner

Partially fixed with @jzarl's #966

Member

Is it realistic to remove all language-related tags and, if needed, change them into country-specific tags? Because I guess we all agree that this is what we want, but who can do it?

Owner

I think it's realistic in the sense that we must do it some day. The tags are quite useless in their current state.

I wonder about the how though. In the past I did some smaller unification attempts, but they have been quite manual. Is there a way to create a file where we can define tags to be deleted or changed to something else, and then just run the whole thing? Not sure whether @jzarl's tool is exactly doing that already...

Author
Member

My script started with exactly that. Alas, I assumed that nobody wants to tediously define a .csv file in order to run a batch job and removed this mode when updating the tool to the new tag syntax.

IMO, though, we didn't really lose anything when I removed the bulk mode. You can still do something like the following:

```
# Remove all deprecated tags in one call. $deprecated_tags is assumed to hold
# a space-separated list and is left unquoted so it splits into one argument per tag.
tools/tagtool/tagtool.sh --remove-tags $deprecated_tags

# Rename tags in bulk: each line of tags_to_rename.txt holds "oldTag newTag".
while read -r oldTag newTag
do
  tools/tagtool/tagtool.sh --rename-tag "$oldTag" "$newTag"
done < tags_to_rename.txt
```
Author
Member

The bigger problem to me seems to be /finding/ these tags, though.
Of course one can manually search for them, but that seems overly tedious.

Ideally, it would be nice to have a script to find candidate tags. Maybe we could identify suspicious tags by comparing the tags of each document to the other translations and issue a warning when there's a discrepancy regarding the tags?

In the past, this would probably have led to way too many warnings, but I guess the situation has improved, and especially with the new tag syntax it has become so much easier to write and translate tags correctly.

Maybe if such a script is written, we could even add it to the git commit hooks and warn authors/translators when they add/remove tags...
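A rough sketch of what such a candidate finder could look like. It assumes tags are written as `<tag key="...">` and that translations are named `<base>.<lang>.xml` with the English file as the reference. Both are assumptions about the repository layout, and the function names `tag_keys` and `check_translations` are made up for this example.

```shell
#!/usr/bin/env bash
# Print the sorted, unique tag keys of one file, one per line.
# Assumes tags look like <tag key="...">; adjust the pattern otherwise.
tag_keys() {
  grep -o '<tag key="[^"]*"' "$1" | sed 's/.*key="\([^"]*\)"/\1/' | sort -u
}

# Compare every translation of a document against its English version.
# $1 is the path without the ".<lang>.xml" suffix (assumed naming scheme).
# Returns 1 if at least one translation has a different set of tag keys.
check_translations() {
  local base=$1 file status=0
  local reference="$base.en.xml"
  [ -f "$reference" ] || return 0   # nothing to compare against
  for file in "$base".*.xml; do
    [ "$file" = "$reference" ] && continue
    if [ "$(tag_keys "$file")" != "$(tag_keys "$reference")" ]; then
      echo "WARNING: tags in $file differ from $reference"
      status=1
    fi
  done
  return $status
}
```

Only the keys are compared, not the translated labels, so a correctly translated tag does not trigger a warning. A file that is tagged by language in one translation and by country in another would show up as a discrepancy.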

Member

> Ideally, it would be nice to have a script to find candidate tags. Maybe we could identify suspicious tags by comparing the tags of each document to the other translations and issue a warning when there's a discrepancy regarding the tags?

Actually I do think that this is an excellent idea.

> In the past, this would probably have led to way too many warnings, but I guess the situation has improved, and especially with the new tag syntax it has become so much easier to write and translate tags correctly.

I agree that for past news and events, this script would lead to many warnings, but actually I think that all of these warnings would be legitimate and should be fixed.

Let's face it: we have a mess, and we won't get rid of that mess without investing some work :-/

It's a lot of work, BUT: it doesn't need deep knowledge of the build or tagging system, so we might find some people to help with this. I'd be ready to cover my share of the cleanup.

> Maybe if such a script is written, we could even add it to the git commit hooks and warn authors/translators when they add/remove tags...

Another excellent idea.

Owner

> Maybe if such a script is written, we could even add it to the git commit hooks and warn authors/translators when they add/remove tags...

Actually, we already have a warning in the pre-commit hook if someone introduces a completely new tag: https://git.fsfe.org/FSFE/fsfe-website/src/branch/master/tools/githooks/pre-commit#L84-L109

For everything else, I agree with Reinhard here.

Member

I think the idea was that the pre-commit hook could warn if the tags in the translation differ from the tags in the original.
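As a sketch of that idea (not the existing hook), the check could read the names of staged files and compare each translation with its English original. The `<tag key="...">` pattern, the `<base>.<lang>.xml` naming and the function names are assumptions made for this example.

```shell
#!/usr/bin/env bash
# Print the sorted, unique tag keys of one file (assumed <tag key="..."> syntax).
tag_keys() {
  grep -o '<tag key="[^"]*"' "$1" | sed 's/.*key="\([^"]*\)"/\1/' | sort -u
}

# Read file names (one per line) from stdin and warn on stderr for every
# translation whose tag keys differ from those of the English original.
warn_on_tag_mismatch() {
  local file original
  while IFS= read -r file; do
    case "$file" in
      *.en.xml) continue ;;        # the original itself
      *.[a-z][a-z].xml) ;;         # a translation: check it
      *) continue ;;               # anything else is ignored
    esac
    original="${file%.[a-z][a-z].xml}.en.xml"
    [ -f "$file" ] && [ -f "$original" ] || continue
    if [ "$(tag_keys "$file")" != "$(tag_keys "$original")" ]; then
      echo "WARNING: $file: tags differ from $original" >&2
    fi
  done
}
```

In a pre-commit hook this could be fed with `git diff --cached --name-only --diff-filter=ACM | warn_on_tag_mismatch`. It only warns and never blocks the commit, which matches the behaviour described for the existing new-tag warning.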

Owner

It seems that we've fixed the wrong tags with the major cleanups in recent months.

I will work on the pre-commit hook and see how we can avoid a mismatch.

max.mehl self-assigned this 2021-01-19 19:48:12 +00:00
max.mehl closed this issue 2021-01-20 14:59:57 +00:00
Reference: FSFE/fsfe-website#954