Add <email> tag to obfuscate emails/strings for better bot protection #2578

Merged
max.mehl merged 7 commits from obfuscate-email into master 2022-04-28 12:02:22 +00:00
Owner

Fixes #2484

This PR introduces a new `<email>` tag that can be used anywhere on the site to obfuscate email addresses. Right now, if you write `me@example.com` in plain text, even the most basic bot that scrapes the website will find it and can spam you (or others).

There are numerous techniques to prevent that. Many of them rely on JavaScript and elaborate CSS rules, but these have at least one major downside: they break accessibility for people using screen readers.

Therefore, I went with a mixture of decimal and hexadecimal character references. Within the `<email>` tag, an `a` becomes `&#x0061;` (hex) while an `x` becomes `&#120;` (dec). Some common characters are simply left unchanged. As far as I understand, this does not affect screen readers, but it should make it harder for simple bots to unscramble the address.
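
To illustrate the idea outside of the site's build system, here is a minimal Python sketch. It is not the code from this PR. The rule for choosing hex or decimal (odd code points become hex, even ones decimal) is an arbitrary assumption picked so that `a` and `x` come out as in the examples above, and the `KEEP_LITERAL` set is hypothetical, since the description does not list which characters stay unchanged.

```python
import html

# Hypothetical set of characters that stay literal; the PR does not list them.
KEEP_LITERAL = frozenset(".")


def obfuscate(text: str, keep: frozenset = KEEP_LITERAL) -> str:
    """Replace characters with a mix of hex and decimal character references."""
    parts = []
    for ch in text:
        code = ord(ch)
        if ch in keep:
            parts.append(ch)  # left unchanged
        elif code % 2:
            # Assumed rule: odd code point -> hexadecimal reference
            parts.append(f"&#x{code:04x};")
        else:
            # Assumed rule: even code point -> decimal reference
            parts.append(f"&#{code};")
    return "".join(parts)


encoded = obfuscate("me@example.com")
# An HTML parser decodes the references back to the plain address,
# which is why the rendered page still shows the address as usual.
assert html.unescape(encoded) == "me@example.com"
```

The round trip through `html.unescape` stands in for what a browser does when it parses the page: the address no longer appears literally in the markup, but the decoded text is identical.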

This also works in XSLT templates, most prominently on our team page. Commit 8d9d8d366e demonstrates this, while 80d6e6e367 shows the simpler form for XML files.

Furthermore, you can automatically turn the address into a clickable `mailto:` link. Such a tag looks like `<email mailto="yes">me@example.com</email>`. If you omit the `mailto` attribute or set it to any value other than `yes`, no link is generated.
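
The attribute logic can be sketched in Python as follows. This is only an illustration of the rule described above, not the implementation in this PR: `render_email` is a hypothetical helper, the exact output markup is an assumption, and the obfuscation step is left out to keep the example short.

```python
import html
import xml.etree.ElementTree as ET


def render_email(xml_fragment: str) -> str:
    """Render an <email> element as a mailto link or as plain text."""
    element = ET.fromstring(xml_fragment)
    address = html.escape((element.text or "").strip(), quote=True)
    # Only the exact value "yes" produces a link; a missing attribute
    # or any other value falls through to plain text.
    if element.get("mailto") == "yes":
        return f'<a href="mailto:{address}">{address}</a>'
    return address


linked = render_email('<email mailto="yes">me@example.com</email>')
plain = render_email('<email>me@example.com</email>')
```

Here `linked` holds an anchor element pointing at `mailto:me@example.com`, while `plain` holds only the address text.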

This element could also be used for things other than email addresses, e.g. telephone numbers. However, since email addresses are probably the most common use case, I kept it simple for editors.

max.mehl added the xsl label 2022-04-22 13:28:24 +00:00
max.mehl added 6 commits 2022-04-22 13:28:26 +00:00
max.mehl added 1 commit 2022-04-28 12:00:35 +00:00
document <email> tag in web features
continuous-integration/drone/pr Build is passing
4ccf2a1e17
max.mehl merged commit e52bf37519 into master 2022-04-28 12:02:22 +00:00
max.mehl deleted branch obfuscate-email 2022-04-28 12:02:23 +00:00

Reference: FSFE/fsfe-website#2578