I share your opinion of SAML, but I have to ask, as someone who has also implemented it in Golang: what gave you any confidence in an implementation backed by encoding/xml? It was pretty obvious to me right away that DSIG and encoding/xml aren't a fit, if only because of encoding/xml's poor namespace support. There are other DSIG Golang libraries that use an etree-style interface for what I presume is the same reason.
Blog author here; Russell's implementation is backed by github.com/beevik/etree, but like you said, it's just an interface. The tokenizer is still encoding/xml.
Adding better support for namespaces and providing APIs compatible with dsig doesn't remove the underlying vulnerabilities.
Ugh. That's disappointing. I loathe SAML, but also think the right thing to do here is to make sure nobody uses encoding/xml as part of their SAML stack.
I don't know about that. libxml certainly doesn't round-trip XML documents in general (though I don't think it breaks namespaces, at least). Whether that breaks SAML or not, I have no idea.
Anyway, from tptacek's other comments it looks like general-purpose XML libraries should not be assumed suitable for SAML. Instead, SAML libraries should have a purpose-built implementation for the SAML bit; once the document has been properly validated and the SAML bits stripped off, I guess the rest can be passed on to a general-purpose library:
> SAML libraries should include purpose-built, locked-down, SAML-only XMLDSIGs, and those XMLDSIGs should include purpose-built, stripped-down XMLs.
I would go out of my way to avoid libxmlsec1 and libxml. I honestly don't understand why it's so hard for a SAML implementation to just bring its own hardened stripped-down XML.
If I had to hazard a guess: rolling your own implementation is usually recommended against, especially for complex formats. That it would be the best practice for SAML does sound counter-intuitive.
This is like saying that variable name scoping is a semantic convention on top of the C language grammar and that a lexer can't really implement it. In the case of C, it turns out that the lexer must implement it. In the case of XML, processing namespace directives during lexing is the right thing to do in nearly all cases. But it's not what these SAML libraries needed.
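For what it's worth, encoding/xml exposes both behaviours, so the difference is easy to see. A rough sketch, using only the standard library (the helper name `firstStartName` and the sample document are made up for illustration): `Decoder.Token` resolves prefixes to namespace URIs while lexing, whereas `Decoder.RawToken` leaves the prefix as written.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// firstStartName is a hypothetical helper that returns the name of the first
// start element in doc. With raw set, it uses RawToken, which does not
// translate namespace prefixes; otherwise it uses Token, which does.
func firstStartName(doc string, raw bool) (xml.Name, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	for {
		var tok xml.Token
		var err error
		if raw {
			tok, err = dec.RawToken()
		} else {
			tok, err = dec.Token()
		}
		if err != nil {
			return xml.Name{}, err // includes io.EOF if no element is found
		}
		if se, ok := tok.(xml.StartElement); ok {
			return se.Name, nil
		}
	}
}

func main() {
	doc := `<a:b xmlns:a="urn:x"></a:b>`

	cooked, err := firstStartName(doc, false)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	rawName, err := firstStartName(doc, true)
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	fmt.Printf("Token:    Space=%q Local=%q\n", cooked.Space, cooked.Local)
	fmt.Printf("RawToken: Space=%q Local=%q\n", rawName.Space, rawName.Local)
}
```

With `Token`, the element's `Space` is the URI `urn:x`; with `RawToken`, it is the literal prefix `a`. A signature-aware consumer needs the second view, or both, since the prefix as written is part of what was signed.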