Search function sometimes misses actual teaser #2472
Reference: FSFE/fsfe-website#2472
The search index misses some results because the indexer takes the first <p> element as the teaser, and that element is often not the actual teaser.
Example: This search does not show this page, but it should.
The entry of the spreadtheword page in the search index looks like the following:
{"url": "https://fsfe.org/contribute/spreadtheword.en.html", "tags": "", "title": "Spread the word", "teaser": "Contribute", "type": "page", "date": null}
"Contribute" is the first p element on this page:
Possible solution: make the search ignore these paragraphs, including the second and third paragraphs on this page, which are also not actual teasers:
Thanks for reporting! I see two solutions:
1. As you rightly said, the indexer currently assumes that the first <p> is the teaser. Maybe we can narrow this assumption down to specific subfolders or files? For example, we can probably assume that the first <p> is always the teaser for news items.
2. Identify teasers with a 'teaser' CSS class.
Thanks. I lean towards 1., although I see the problem of excluding all these edge cases. 2. makes things harder for editors, and we already have a number of ids and classes for introductions and such.
@reinhard, what's your take on this?
I think the best solution would be to explicitly exclude some paragraphs based on id and/or class and then take the first paragraph of what is left.
In case we implement #1348, we could add a rule that if a paragraph is formatted with that class="lead", then that's the teaser with highest priority.
Sounds good. So we would tackle this from two sides:
Right?
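To make the proposal concrete, here is a rough sketch of how the selection rule discussed above could look. It is not the actual indexer code: it uses only Python's standard html.parser, and the names pick_teaser, EXCLUDED_IDS and EXCLUDED_CLASSES, as well as the id "category" and the excluded class names, are made up for illustration. The assumed order is: drop excluded paragraphs, prefer a class="lead" paragraph among the rest, otherwise take the first one left.

```python
from html.parser import HTMLParser

# Hypothetical ids/classes of paragraphs that are never teasers.
# The real values would have to come from the site's templates.
EXCLUDED_IDS = {"category"}
EXCLUDED_CLASSES = {"breadcrumb", "license"}
LEAD_CLASS = "lead"


class ParagraphCollector(HTMLParser):
    """Collect (id, classes, text) for every <p> element in a document."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._current = None  # [id, classes, text parts] while inside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self.finish_paragraph()  # an unclosed <p> ends at the next <p>
            attributes = dict(attrs)
            classes = set((attributes.get("class") or "").split())
            self._current = [attributes.get("id"), classes, []]

    def handle_endtag(self, tag):
        if tag == "p":
            self.finish_paragraph()

    def handle_data(self, data):
        if self._current is not None:
            self._current[2].append(data)

    def finish_paragraph(self):
        if self._current is not None:
            pid, classes, parts = self._current
            text = " ".join("".join(parts).split())  # normalise whitespace
            self.paragraphs.append((pid, classes, text))
            self._current = None


def pick_teaser(html):
    """Return the teaser text for a page, or None if no paragraph qualifies."""
    collector = ParagraphCollector()
    collector.feed(html)
    collector.close()
    collector.finish_paragraph()  # flush a trailing unclosed <p>

    # Step 1: drop empty paragraphs and those excluded by id or class.
    candidates = [
        (classes, text)
        for pid, classes, text in collector.paragraphs
        if text and pid not in EXCLUDED_IDS and not classes & EXCLUDED_CLASSES
    ]
    # Step 2: a paragraph with class="lead" has the highest priority.
    for classes, text in candidates:
        if LEAD_CLASS in classes:
            return text
    # Step 3: otherwise fall back to the first remaining paragraph.
    return candidates[0][1] if candidates else None


if __name__ == "__main__":
    # Made-up markup; the real page's markup is not reproduced here.
    page = """
    <p id="category"><a href="/contribute/">Contribute</a></p>
    <h1>Spread the word</h1>
    <p>First real paragraph.</p>
    """
    print(pick_teaser(page))
```

With a rule like this, a page whose first paragraph is only a category link would no longer get that link as its teaser, provided the link paragraph carries one of the excluded ids or classes.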