Improve XML paging serialization performance #3690
Add profiling scripts and baseline measurements for XML paging while reducing overhead in the XML response writer path.
Precompute tag fragments, streamline repeated field writing, avoid unnecessary sequence allocation and use a faster XML escaping helper in the hot path.
Add a fast path for common search Bundle.entry shapes to avoid generic XML field iteration for fullUrl, resource and search metadata.
Introduce XmlDirectWriter and dispatch selected FHIR complex types through direct XML serialization with regression coverage for Period values.
Add XmlUtf8Writer and route XML serialization through it so primitive XML output can avoid extra character encoding work.
Extend direct XML serialization to additional Period, Coding, Identifier and Meta cases, and optimize the UTF-8 writer for common ASCII output paths.
Capture the optimization context, verification results, benchmark observations and discarded experiments for follow-up review.
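The commit messages above mention a faster escaping helper and an ASCII-optimized UTF-8 path. The following is a minimal sketch of that general idea, not the code from this PR: the class `EscapeSketch` and its methods are hypothetical names. ASCII characters are written as single bytes, the four escaped characters use precomputed byte arrays, and only runs of non-ASCII characters go through the general UTF-8 encoder. It assumes attribute-value escaping and ignores control characters such as tabs and newlines, which a complete writer would also need to handle.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class EscapeSketch {
    // Precomputed replacement bytes, so no string encoding happens per character.
    private static final byte[] AMP = "&amp;".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] LT = "&lt;".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] GT = "&gt;".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] QUOT = "&quot;".getBytes(StandardCharsets.US_ASCII);

    /** Writes s as UTF-8, escaping the characters that are unsafe in an attribute value. */
    static void writeEscaped(OutputStream out, String s) throws IOException {
        int i = 0;
        int n = s.length();
        while (i < n) {
            char c = s.charAt(i);
            if (c < 0x80) {
                // Fast path: one ASCII char maps to one byte (or a fixed entity).
                switch (c) {
                    case '&': out.write(AMP); break;
                    case '<': out.write(LT); break;
                    case '>': out.write(GT); break;
                    case '"': out.write(QUOT); break;
                    default: out.write(c);
                }
                i++;
            } else {
                // Slow path: encode the whole non-ASCII run at once. Surrogate
                // pairs are both >= 0x80, so a well-formed pair is never split.
                int j = i + 1;
                while (j < n && s.charAt(j) >= 0x80) {
                    j++;
                }
                out.write(s.substring(i, j).getBytes(StandardCharsets.UTF_8));
                i = j;
            }
        }
    }

    /** Convenience wrapper for trying the sketch out; returns the escaped text. */
    static String escapeToString(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            writeEscaped(out, s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(escapeToString("a<b & \"c\""));
    }
}
```

In a real hot path the target would be a buffered stream or a reusable byte buffer rather than a `ByteArrayOutputStream`; the wrapper exists only to make the sketch easy to exercise.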
alexanderkiel
left a comment
I don't like the approach with the low-level XmlUtf8Writer.
Benchmark a clean Woodstox XMLStreamWriter emit path against this branch. This is the real experiment worth running. Drop both XmlUtf8Writer and the hand-rolled write-xml-element, drive WstxOutputFactory directly from your xml-handlers/unform-xml walk (no data.xml Element tree, no emit). That removes the intermediate tree that made main slow, keeps a maintained library doing escaping/encoding, and deletes ~200 lines of hand-rolled byte-twiddling + the duplicated escaping now living in both Java (XmlUtf8Writer.writeEscaped) and Clojure (write-xml-str). If it's within a few percent, the maintenance win is worth it.
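To make the suggested experiment more concrete, here is a rough sketch of what a streaming emit path could look like for a FHIR Period, where primitive values are written as `value` attributes. It uses the standard StAX API (`javax.xml.stream`) with the factory the JDK resolves by default, so it runs without extra dependencies; the assumption is that with Woodstox on the classpath, `new com.ctc.wstx.stax.WstxOutputFactory()` could be used in place of `XMLOutputFactory.newFactory()`. The class `StaxPeriodSketch` and its methods are hypothetical, and the Clojure-side walk over `xml-handlers`/`unform-xml` is not shown.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

final class StaxPeriodSketch {
    private static final String FHIR_NS = "http://hl7.org/fhir";

    // Assumption: swapping in Woodstox's WstxOutputFactory here would give the
    // emit path proposed in the review; the JDK default keeps the sketch runnable.
    private static final XMLOutputFactory FACTORY = XMLOutputFactory.newFactory();

    /** Writes a FHIR primitive as <name value="..."/>; the library does the escaping. */
    static void writePrimitive(XMLStreamWriter w, String name, String value)
            throws XMLStreamException {
        if (value == null) {
            return; // absent fields are simply not emitted
        }
        w.writeEmptyElement(name);
        w.writeAttribute("value", value);
    }

    /** Streams a period element straight to bytes, with no intermediate element tree. */
    static String periodXml(String start, String end) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            XMLStreamWriter w = FACTORY.createXMLStreamWriter(out, "UTF-8");
            w.writeStartElement("period");
            // Declared here only so the fragment stands alone; in a full resource
            // the namespace would sit on the root element.
            w.writeDefaultNamespace(FHIR_NS);
            writePrimitive(w, "start", start);
            writePrimitive(w, "end", end);
            w.writeEndElement();
            w.flush();
            w.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(periodXml("2020-01-01", "2020-12-31"));
    }
}
```

The point of the sketch is the shape of the code: escaping and encoding stay inside the library, and the caller only issues element and attribute events. Whether this comes within a few percent of the hand-rolled writer is exactly what the proposed benchmark would have to show.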
import java.io.OutputStream;
import java.io.Writer;

public final class XmlUtf8Writer extends Writer {
This class is a bit too low-level.
Summary
This draft PR improves FHIR XML paging serialization performance by reducing generic XML writer overhead and adding targeted direct XML writers for common FHIR structures.
The full original WIP history is preserved in the fork branch codex/xml-direct-writer-wip. This branch is the review-oriented version with the same final code changes split into larger logical commits.
Commit Structure
- Optimize XML response output (profiling/xml-paging)
- Reduce generic XML writer overhead
- Specialize XML search bundle entries (Bundle.entry shapes)
- Add direct XML writers for common complex types (XmlDirectWriter; covers first common complex types with regression tests)
- Write XML through a UTF-8 writer (XmlUtf8Writer; routes XML output through the UTF-8 writer path)
- Expand direct XML writing for common values
- Document XML optimization handover
Verification
Run on codex/xml-direct-writer-review:
Additional jar check:
Local Benchmark
Three local Blaze instances were used with the same dataset. The PR instance was started from a freshly built image of this branch and used a copied Docker volume from the existing optimized Blaze instance, so the resource counts match exactly:
Comparison below is XML output, median of runs 2-8:
A direct comparison between the fresh PR instance on port 8090 and the existing optimized WIP instance on port 8080 showed essentially identical timings, with identical XML byte sizes and only about 1-2% variance.
Notes
This is intentionally a draft PR. The implementation is performance-oriented and should be reviewed especially for maintainability of the XML writer fast paths and whether the profiling artifacts should stay in the final PR.
Default Page Size Benchmark
The same local setup was also benchmarked without an explicit _count parameter. Blaze therefore used its default page size of 50. XML output, median of runs 6-30:
The speedup is smaller than with _count=1000 or _count=5000, because fixed request, DB, and HTTP overhead account for a larger share of total time at 50 resources per page.
Full Paging Download Benchmark
The following benchmark follows all next links until the complete resource type has been downloaded. XML compares this PR on port 8090 against the old baseline on port 8081. JSON was measured only on the PR instance. Values are medians of three full downloads.
At the default page size of 50, fixed paging and HTTP overhead dominate, so the XML speedup is modest. With larger pages, XML serialization becomes a larger share of total runtime and the PR shows substantially larger gains. On this dataset, PR XML is also slightly faster than PR JSON for the measured full downloads.
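The dependence of the speedup on page size can be written down with a simple cost model. This is an illustrative simplification, not something measured in this PR: the symbols $F$ (fixed per-request cost for paging, DB and HTTP), $s_{\text{old}}$ and $s_{\text{new}}$ (per-resource XML serialization cost before and after), and $n$ (page size) are assumptions, and any per-resource DB cost is ignored.

```latex
\begin{align*}
T_{\text{old}}(n) &= F + n\, s_{\text{old}}, \qquad
T_{\text{new}}(n) = F + n\, s_{\text{new}}, \\[4pt]
S(n) &= \frac{T_{\text{old}}(n)}{T_{\text{new}}(n)}
      = \frac{F + n\, s_{\text{old}}}{F + n\, s_{\text{new}}}, \\[4pt]
\frac{dS}{dn} &= \frac{F\,(s_{\text{old}} - s_{\text{new}})}{(F + n\, s_{\text{new}})^{2}} > 0
      \quad \text{for } s_{\text{old}} > s_{\text{new}}, \\[4pt]
\lim_{n \to 0} S(n) &= 1, \qquad
\lim_{n \to \infty} S(n) = \frac{s_{\text{old}}}{s_{\text{new}}}.
\end{align*}
```

Under this model the speedup grows monotonically with the page size and is bounded by the ratio of the per-resource serialization costs, which is consistent with the modest gain at 50 resources per page and the larger gains at bigger pages.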