refactor: use lxml instead of string templating in process_file. #5146

Merged
tobiasd merged 12 commits from feat/lxml into master 2025-07-28 13:06:34 +00:00
Member

The code is much cleaner and easier to read, and it is also faster.

It also prevents the construction of invalid XML.
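
To illustrate the invalid-XML point, here is a minimal sketch rather than the actual `process_file` code. The element names and the `title` value are made up for the example, and the stdlib fallback import is only there so the snippet runs where lxml is not installed.

```python
try:
    from lxml import etree
except ImportError:
    # Fallback for illustration only: the stdlib module offers the same calls used here.
    import xml.etree.ElementTree as etree

# Hypothetical input containing characters that are special in XML.
title = "News & Events <2025>"

# String templating pastes the text in verbatim, so the result is not well-formed.
templated = f"<page><title>{title}</title></page>"
try:
    etree.fromstring(templated)
    templated_ok = True
except etree.ParseError:
    templated_ok = False

# Building a tree lets the serializer escape the text for us.
page = etree.Element("page")
etree.SubElement(page, "title").text = title
built = etree.tostring(page, encoding="unicode")

print("templated parses:", templated_ok)
print("built:", built)
```

Parsing `built` again gives back the original title text unchanged, which is the property string templating cannot guarantee without manual escaping.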

Some benchmarks:

❯ hyperfine --runs 3 "uv run build --full "; git checkout master; hyperfine --runs 3 "uv run build --full "
Benchmark 1: uv run build --full
  Time (mean ± σ):     287.612 s ±  4.557 s    [User: 1613.681 s, System: 151.556 s]
  Range (min … max):   283.550 s … 292.539 s    3 runs
Switched to branch 'master'
Your branch is up to date with 'origin/master'.
Benchmark 1: uv run build --full
  Time (mean ± σ):     357.248 s ±  5.578 s    [User: 2052.506 s, System: 163.057 s]
  Range (min … max):   350.810 s … 360.609 s    3 runs

So this reduces full build times by about 70 seconds (roughly 19%), getting us down to just under 5 minutes for a full build on my machine.
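
The figures from the two hyperfine runs above can be checked with a few lines of arithmetic. The variable names are mine; the numbers are the reported means.

```python
# Mean wall-clock times reported by hyperfine (3 runs each).
branch_mean_s = 287.612   # feat/lxml
master_mean_s = 357.248   # master

saved_s = master_mean_s - branch_mean_s
reduction = saved_s / master_mean_s
speedup = master_mean_s / branch_mean_s

print(f"saved {saved_s:.1f} s ({reduction:.1%}), speedup {speedup:.2f}x")
# prints: saved 69.6 s (19.5%), speedup 1.24x
```

With only three runs per branch and a σ of about 5 s on each side, the exact percentage should be read as approximate.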

delliott force-pushed feat/lxml from ac54912b71 to 85af74c12d 2025-07-13 10:51:27 +00:00 Compare
delliott reviewed 2025-07-13 10:53:38 +00:00
@@ -246,0 +211,4 @@
href.text,
flags=re.MULTILINE | re.IGNORECASE,
)
except AssertionError:
Author
Member

Intuitively, this should be a full failure. But we use the build process to output ics files, so we instead just catch the error and log it.
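
A rough sketch of the catch-and-log pattern described here, not the code from this PR: `rewrite_links`, the element name, the asserted condition, and the regex are all assumptions made for illustration. Only the `re` flags and the `except AssertionError` come from the diff context above.

```python
import logging
import re

try:
    from lxml import etree
except ImportError:
    # Fallback for illustration only: the stdlib module offers the same calls used here.
    import xml.etree.ElementTree as etree

logger = logging.getLogger(__name__)


def rewrite_links(root, pattern, replacement):
    """Rewrite the text of each <href> element; log and skip bad ones instead of aborting."""
    skipped = 0
    for href in root.iter("href"):
        try:
            # Hypothetical precondition: the element must carry text.
            assert href.text is not None, "href element has no text"
            href.text = re.sub(
                pattern,
                replacement,
                href.text,
                flags=re.MULTILINE | re.IGNORECASE,
            )
        except AssertionError as exc:
            # Keep the build going so the remaining outputs are still produced.
            logger.warning("skipping element: %s", exc)
            skipped += 1
    return skipped


root = etree.fromstring("<links><href>HTTP://example.org/a</href><href/></links>")
skipped = rewrite_links(root, r"^http://", "https://")
```

One caveat with this pattern: running Python with `-O` removes `assert` statements, so the check would silently disappear there.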

delliott requested review from sofiaritz 2025-07-13 10:54:37 +00:00
delliott force-pushed feat/lxml from a9b4613252 to 45115ef18b 2025-07-13 13:29:43 +00:00 Compare
delliott force-pushed feat/lxml from 45115ef18b to 78d3246d06 2025-07-13 13:35:57 +00:00 Compare
delliott requested review from tobiasd 2025-07-27 15:50:10 +00:00
delliott force-pushed feat/lxml from 78d3246d06 to 7b306a1423 2025-07-27 15:50:13 +00:00 Compare
Owner

There is a merge conflict now

delliott force-pushed feat/lxml from 7b306a1423 to 89fd795a1a 2025-07-28 12:55:08 +00:00 Compare
Author
Member

Right, fixed the merge conflict

tobiasd merged commit c4b7f0f33c into master 2025-07-28 13:06:34 +00:00
tobiasd referenced this issue from a commit 2025-07-29 08:18:04 +00:00