It occurred to me the other day that .NET Core is now open source, so I can actually go and look at the code behind the Uri.TryCreate method.
It transpires that Uri is a partial class and the meat of the functionality is split over the two following files:
https://github.com/dotnet/corefx/blob/41e203011152581a6c65bb81ac44ec037140c1bb/src/System.Private.Uri/src/System/UriExt.cs
https://github.com/dotnet/corefx/blob/41e203011152581a6c65bb81ac44ec037140c1bb/src/System.Private.Uri/src/System/Uri.cs
The first thing that struck me was that there's an awful lot of code involved in attempting to create a Uri.
The static TryCreate methods, which live in UriExt.cs, are deceptively simple. The TryCreate overload I'm interested in is the simplest of those: it really just passes the work off to a CreateHelper method, which in turn passes the work off to a ParseScheme method located in Uri.cs.
ParseScheme appears to do some basic length checking before deferring to ParseSchemeCheckImplicitFile, which is where the main body of work seems to take place. As far as I can glean the following rules are being observed:
- Leading whitespace is ignored
- A URL is valid if it starts with at least two characters, the first of which is not a digit, followed by a colon, unless those characters form a scheme you'd recognise (ftp, http, https, etc.), at which point more intuitive validation kicks in.
- UNC paths are valid e.g. //foo
Given these rules, the following odd strings pass as valid absolute URIs:
"aa:"
"fo:o"
" fo:o"
"javascript:void()"
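A quick way to check this for yourself is a small console program along the following lines. It is only a sketch: UriProbe and IsAbsolute are names made up for illustration, the only framework call is Uri.TryCreate with UriKind.Absolute, and the results may differ between .NET versions.

```csharp
using System;

public static class UriProbe
{
    // Thin wrapper around the Uri.TryCreate overload discussed above.
    // Returns true when the string parses as an absolute URI.
    public static bool IsAbsolute(string candidate)
    {
        return Uri.TryCreate(candidate, UriKind.Absolute, out _);
    }

    public static void Main()
    {
        // The odd-looking strings from the list above.
        string[] candidates = { "aa:", "fo:o", " fo:o", "javascript:void()" };

        foreach (string candidate in candidates)
        {
            Console.WriteLine("\"{0}\" -> {1}", candidate, IsAbsolute(candidate));
        }
    }
}
```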
Maybe there are just so many esoteric schemes out there that robust validation is not viable.
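If all you actually care about is web addresses, one common workaround (not something TryCreate does for you) is to check the parsed scheme against a short allow-list afterwards. A minimal sketch, assuming only http and https are wanted; UrlCheck and IsHttpUrl are hypothetical names:

```csharp
using System;

public static class UrlCheck
{
    // Accepts a string only if it parses as an absolute URI
    // AND its scheme is http or https.
    public static bool IsHttpUrl(string candidate)
    {
        Uri parsed;
        if (!Uri.TryCreate(candidate, UriKind.Absolute, out parsed))
        {
            return false;
        }

        // Uri.Scheme is returned in lower case, so a plain comparison is enough.
        return parsed.Scheme == Uri.UriSchemeHttp
            || parsed.Scheme == Uri.UriSchemeHttps;
    }

    public static void Main()
    {
        string[] candidates = { "http://example.com/", "aa:", "javascript:void()" };

        foreach (string candidate in candidates)
        {
            Console.WriteLine("\"{0}\" -> {1}", candidate, IsHttpUrl(candidate));
        }
    }
}
```

This does nothing about the permissive parsing itself; it just narrows what you are willing to accept afterwards.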