Forums » Chit-chat » does it even has bbcode

How many times does this get escaped.

This should be visible as simple html markup:


<ul>
<li>List item 1</li>
<li>List item 2</li>
<li>List item 3</li>
<li>List item 4</li>
</ul>


Do [url=javascript:alert('Sadly_javascript_is_still_a_valid_url_by_specification')]URLs get matched?[/url]

Does [url=http://beta.stepmania.com/forums/meta/show/javascript:alert('Sadly_javascript_is_still_a_valid_url_by_specification')]this get executed?[/url]


Do my accents get misencoded? árvíztűrő tükörfúrógép ÁRVÍZTŰRŐ TÜKÖRFÚRÓGÉP

Last edited: 22 January 2014 12:26am

Funny Picture" onload="console.log(document.cookie)
Reply
How many times does this get escaped.
too many

This should be visible as simple html markup:


<ul>
<li>List item 1</li>
<li>List item 2</li>
<li>List item 3</li>
<li>List item 4</li>
</ul>


Do [url=javascript:alert('Sadly_javascript_is_still_a_valid_url_by_specification')]URLs get matched?[/url]

hmmm

Do my accents get misencoded? árvíztűrő tükörfúrógép ÁRVÍZTŰRŐ TÜKÖRFÚRÓGÉP
so far so good
Reply
[url=javascript:alert("test")]js thingy that shouldn't pass validation[/url]
[url]javascript:alert("another_js_test")[/url]

this thing might come in handy, too:
_^(?:(?:https?|ftp)://)(?:\S+(?::\S*)?@)?(?:(?!10(?:\.\d{1,3}){3})(?!127(?:\.\d{1,3}){3})(?!169\.254(?:\.\d{1,3}){2})(?!192\.168(?:\.\d{1,3}){2})(?!172\.(?:1[6-9]|2\d|3[0-1])(?:\.\d{1,3}){2})(?:[1-9]\d?|1\d\d|2[01]\d|22[0-3])(?:\.(?:1?\d{1,2}|2[0-4]\d|25[0-5])){2}(?:\.(?:[1-9]\d?|1\d\d|2[0-4]\d|25[0-4]))|(?:(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)(?:\.(?:[a-z\x{00a1}-\x{ffff}0-9]+-?)*[a-z\x{00a1}-\x{ffff}0-9]+)*(?:\.(?:[a-z\x{00a1}-\x{ffff}]{2,})))(?::\d{2,5})?(?:/[^\s]*)?$_iuS

EDIT: they fail with single quotes, pass with double.

Last edited: 22 January 2014 12:57am

Reply
EDIT: they fail with single quotes, pass with double.
At least you can't bust out of attributes now; the JavaScript parser just recognises them as proper double quotes.

The encoded single quotes (&#039; or &apos;) would also work, but the URL validation stops it. (See this fiddle for reference.)

By the way, you don't even need quotes for strings, as String.fromCharCode() accepts as many parameters as you want. For example, [url]javascript:alert(String.fromCharCode(65,66,67))[/url] contains no special HTML characters, so it gets through no matter how quotes are handled.

There are 2 contexts where the input from BBCodes can end up: between tags and as an attribute value.

If the context is between tags, HTML encoding prevents the input from being anything other than plain text.
If the context is an attribute value, no encoding alone will help while still producing the expected results on proper inputs.

It seems basically anything that can end up as an attribute needs to be properly validated.
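
To make the two contexts concrete, here is a minimal Python sketch. The forum's own code is not shown in this thread, so `html.escape` from the standard library only stands in for whatever encoder it really uses. It shows that escaping stops the quote breakout but leaves a quote-free javascript: URL untouched.

```python
import html

# Signature-style payload that tries to break out of a double-quoted attribute.
breakout = 'Funny Picture" onload="console.log(document.cookie)'
# URL payload that contains no HTML special characters at all.
js_url = "javascript:alert(String.fromCharCode(65,66,67))"

escaped_breakout = html.escape(breakout, quote=True)
escaped_js_url = html.escape(js_url, quote=True)

# The double quotes become &quot;, so the value stays inside the attribute.
print('<img alt="%s">' % escaped_breakout)
# Escaping leaves the URL unchanged, so it still needs scheme validation.
print('<a href="%s">link</a>' % escaped_js_url)
```

So encoding is enough for the alt text, while the href needs a separate check on the URL itself.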

Last edited: 22 January 2014 5:42am

Reply
Anything using the javascript scheme should be rejected now (it rejected them correctly in all my test cases at least).

So, how many holes can you poke in this approach?
Reply
I have yet to find any major flaw that works in my installed browsers, so at least attackers with minimal understanding of the subject (like me) will be stopped.

The most severe one I could find is [url]vbscript:msgbox("XSS")[/url], which works in oldIE and in the "compatibility mode" of IE11 (the design looks fabulous there :D).

Some cheat sheets (this too and this too) and smoke tests include attack vectors inside src attributes like [url]javas\0cript:alert("xss")[/url], [url]javas\tcript:alert("xss")[/url], javas&#0173;cript:alert("xss") and similar, but I haven't seen any of them actually work, even if some of them get through as URLs.
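
One reason to take the tab variant seriously: browsers that follow the WHATWG URL Standard drop ASCII tabs and newlines from a URL before parsing it, so a real tab inside "javascript" does not change the scheme. Below is a rough Python sketch of normalising the input the same way before looking at the scheme. `extract_scheme` is a hypothetical helper, not something from the forum's code, and it only handles real control characters, not a literal backslash followed by t.

```python
import re

# Hypothetical helper; not taken from the forum's code base.
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.\-]*):")
_C0_AND_SPACE = "".join(map(chr, range(0x00, 0x21)))


def extract_scheme(url):
    """Return the lowercased scheme after browser-like cleanup, or None."""
    # Drop leading and trailing C0 control characters and spaces.
    url = url.strip(_C0_AND_SPACE)
    # Drop tab, line feed and carriage return anywhere in the string.
    url = re.sub(r"[\t\n\r]", "", url)
    match = _SCHEME_RE.match(url)
    return match.group(1).lower() if match else None


print(extract_scheme("javas\tcript:alert(1)"))
print(extract_scheme("  JaVaScRiPt:alert(1)"))
print(extract_scheme("/forums/meta"))
```

Checking the scheme only after this cleanup avoids rejecting "javascript:" while letting a tab-split spelling of it slip through.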

Last edited: 22 January 2014 1:56pm

Reply
OWASP has another excellent article about XSS prevention which I often turn to when in doubt. It is extremely educational and, although not all parts apply to our problem, I would recommend carefully reading it to anyone planning to write anything that outputs untrusted user input ever.

There are a few points that apply to this bbcode model. Namely:

* RULE #1 - HTML Escape Before Inserting Untrusted Data into HTML Element Content.

This applies everywhere. The only encoding we don't do is encoding / as &#x2F; . I'm not sure if there is an attack that could currently be performed with it, but it might be worth doing just to be safe.

* RULE #2 - Attribute Escape Before Inserting Untrusted Data into HTML Common Attributes

Although this sounds very applicable to our case, it is not. "Properly quoted attributes can only be escaped with the corresponding quote", and HTML encoding already handles that.

Also since this rule only applies to common attributes where no script execution is possible, anything related to javascript, urls and css needs further care.

* RULE #4 - CSS Escape And Strictly Validate Before Inserting Untrusted Data into HTML Style Property Values

We currently have a color, a small and a large tag where CSS properties are set. They are all validated by CssColorValidator. It might be worth looking into it to see if that's enough. The article describes some basic attack vectors.
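
For comparison, a strict whitelist for colour values could look roughly like the Python sketch below. The real CssColorValidator is not shown in this thread, so `is_safe_css_color` and both patterns are purely hypothetical stand-ins. The idea is to accept only hex colours and plain alphabetic names, so nothing containing parentheses, semicolons or url(...) can reach the style attribute.

```python
import re

# Hypothetical stand-in; the real CssColorValidator is not shown in the thread.
_HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")
_NAMED_COLOR = re.compile(r"[a-zA-Z]{3,20}")


def is_safe_css_color(value):
    """Accept only #rgb, #rrggbb or a purely alphabetic colour name."""
    # fullmatch requires the whole string to match, so trailing junk fails.
    return bool(_HEX_COLOR.fullmatch(value) or _NAMED_COLOR.fullmatch(value))


print(is_safe_css_color("#ff0000"))
print(is_safe_css_color("red; background: url(javascript:alert(1))"))
```

This does not check that a name is a real CSS colour; an unknown name is simply ignored by the browser, which is harmless.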

* RULE #6 - Sanitize HTML Markup with a Library Designed for the Job

If there are parts that cannot be escaped properly, feed the output through a sanitization library just to be sure.


There is no detailed part about safely handling full URLs, but the summary mentions a few defense approaches: canonicalize input, URL validation, safe URL verification, whitelist http and https URLs only, attribute encoder. In our case this is the hardest of all.

Last edited: 22 January 2014 2:44pm

Reply
I'll start putting the schemes in a whitelist when I get home, I guess. I'd like to keep HTML purifier around just as a last resort (as I said on the tracker).
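
A scheme whitelist along those lines might look like this in Python. It is only a sketch: `ALLOWED_SCHEMES`, `is_allowed_url` and the choice to require an absolute http or https URL with a host are assumptions for illustration, not the forum's actual rules.

```python
from urllib.parse import urlsplit

# Assumed whitelist for illustration; the real list may differ.
ALLOWED_SCHEMES = ("http", "https")


def is_allowed_url(url):
    """Accept only absolute URLs with a whitelisted scheme and a host."""
    try:
        parts = urlsplit(url)
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs.
        return False
    # Anything without a whitelisted scheme (javascript:, vbscript:,
    # relative paths, unparseable schemes) is rejected.
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)


print(is_allowed_url("http://beta.stepmania.com/forums/"))
print(is_allowed_url("javascript:alert(1)"))
print(is_allowed_url('vbscript:msgbox("XSS")'))
```

Because unknown or missing schemes fail by default, new tricks like vbscript: do not need their own blacklist entry.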
Reply