The Wild Wild Web:

Information Security from a Developer's Point of View

Node.js and PHP are both powerful server-side platforms. They also leave security almost entirely up to the developer. In other words, as a developer you're not handcuffed, but you're also starkly exposed unless you write a suit of armour into each of your scripts.

No website can be totally secure, but protecting a site spares you most of the headaches you'll almost certainly face if it remains unprotected. Here I describe some of the precautions that safeguard the database-driven CMS I wrote for this website.

I also define some terms along the way, which should be useful as a sort of InfoSec 101 for developers. The definitions are a bit oversimplified, but hopefully I have made up for that with clarity. For security beginners this should also serve as an inroad into reading texts like Padraic Brady's Survive The Deep End: PHP Security or Karl Düüna's Secure Your Node.js Web Application.

Some Of This Website's Protections For Current Or Future Inputs

This website is built with a custom CMS I wrote in PHP. Here are a few of the things I did to protect it. There are many more, such as sending the appropriate HTTP headers, anti-SQL-injection code, and sanitization of all inputs from the admin backend inside the class whose methods receive that input.

Cross-Site Scripting (XSS). XSS is a security threat that allows an attacker to inject any JavaScript they please into your displayed webpages. If allowed through, this gives the attacker carte blanche to manipulate your website any way they see fit, including manipulation of the DOM, stealing users' personally identifiable information, sending Ajax requests to other sites and a whole host of other nasties.
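To make the threat concrete, here is a minimal sketch of what an injected script looks like and how escaping neutralizes it. The comment text and variable names are made up for illustration, and plain escaping with 'htmlspecialchars' is the textbook defence for text-only fields, not the approach this CMS takes for rich HTML.

```php
<?php
declare(strict_types=1);

// Hypothetical untrusted input, e.g. a comment submitted through a form.
$comment = '<script>alert("stolen: " + document.cookie)</script>';

// Unsafe: echoing this string would let the browser run the attacker's script.
$unsafe = '<p>' . $comment . '</p>';

// Safer: special characters become entities, so the payload is displayed as text.
$safe = '<p>' . htmlspecialchars($comment, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8') . '</p>';

echo $safe, PHP_EOL;
```

Escaping works when the field should never contain markup. When users are allowed to submit real HTML, as in a CMS, the markup has to be parsed and filtered instead.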

I took my inspiration from Edward Z. Yang, the creator of HTMLPurifier (a poetic, elegantly written piece of code). For my CMS, however, I decided to be more draconian than he was: I coded an HTML reparser that takes the HTML you put in, turns it into a tree, then strips anything with even a remote potential to be a threat. It then returns the clean HTML to your application.

Any HTML that the CMS serves dynamically from the server side to the frontend goes through this cleaning first.

The reparser eliminates '<script>…</script>' and other tags, rearranges the final product according to the rules of inline vs block nodes, etc. If the HTML is so bad that the reparser can't find a way to clean it up completely, it returns an empty string. This usually doesn't happen unless the input actively tries to be malicious or has very poor syntax.
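The general idea of allowlist-based cleaning can be sketched in a few lines. This is not the CMS's reparser: the allowlist, the function names 'cleanHtml' and 'sanitizeChildren', and the rule for 'href' values are all assumptions chosen for illustration, and the sketch does no inline/block rearranging. It assumes PHP 8 with the DOM extension.

```php
<?php
declare(strict_types=1);

// Hypothetical allowlist: tag name => attributes permitted on that tag.
const ALLOWED_TAGS = [
    'p' => [], 'b' => [], 'i' => [], 'em' => [], 'strong' => [],
    'ul' => [], 'ol' => [], 'li' => [], 'br' => [],
    'a' => ['href'],
];

function sanitizeChildren(DOMNode $parent): void
{
    // Copy into an array first: removing nodes while iterating a live list skips items.
    foreach (iterator_to_array($parent->childNodes, false) as $child) {
        if ($child instanceof DOMText) {
            continue; // plain text is kept; it is escaped again on output
        }
        if (!$child instanceof DOMElement || !array_key_exists($child->tagName, ALLOWED_TAGS)) {
            $parent->removeChild($child); // scripts, comments and unknown tags go, children included
            continue;
        }
        foreach (iterator_to_array($child->attributes, false) as $attr) {
            $allowed = in_array($attr->name, ALLOWED_TAGS[$child->tagName], true);
            // Only plain web and mail links survive; this rejects javascript: URLs.
            if ($attr->name === 'href' && !preg_match('#^(https?://|mailto:)#i', $attr->value)) {
                $allowed = false;
            }
            if (!$allowed) {
                $child->removeAttribute($attr->name);
            }
        }
        sanitizeChildren($child);
    }
}

function cleanHtml(string $html): string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // collect parser warnings instead of printing them
    // The wrapper div lets us find the fragment again; the prefix hints UTF-8 to libxml.
    $loaded = $doc->loadHTML('<?xml encoding="UTF-8"><div id="root">' . $html . '</div>', LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if ($loaded === false) {
        return ''; // unparseable input yields nothing at all
    }
    $root = (new DOMXPath($doc))->query('//div[@id="root"]')->item(0);
    if ($root === null) {
        return '';
    }
    sanitizeChildren($root);
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$clean = cleanHtml('<p onclick="x()">Hi <script>alert(1)</script><a href="javascript:alert(1)">link</a></p>');
echo $clean, PHP_EOL;
```

In this sketch the 'onclick' attribute, the script element and the 'javascript:' link target are all dropped, while the paragraph text and the link text remain. An allowlist is used on purpose: anything not explicitly permitted is removed, so new or obscure tags fail closed.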

For my anti-XSS code, I found it easier to parse the initial HTML with PHP's 'DOMDocument' class, then convert the result in my own reparser into a Left-Child, Right-Sibling (LCRS) binary tree of my own design. After manipulating the tree where necessary, my code returns the final product. All of this is covered by unit tests.
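For readers who have not met the LCRS representation: every node keeps just two links, one to its first child and one to its next sibling, so a tree with any number of children per node fits into a binary-tree shape. The sketch below converts a parsed document into such a tree. The class 'LcrsNode' and the functions 'toLcrs' and 'preorder' are hypothetical names, not the CMS's own design, and the example uses 'loadXML' on a tiny fragment so the resulting structure is easy to predict.

```php
<?php
declare(strict_types=1);

// Hypothetical node type: each node stores only its first child and its next sibling.
final class LcrsNode
{
    public string $label;
    public ?LcrsNode $leftChild = null;    // first child
    public ?LcrsNode $rightSibling = null; // next sibling

    public function __construct(string $label)
    {
        $this->label = $label;
    }
}

// Convert a DOM node and everything below it into an LCRS tree.
function toLcrs(DOMNode $node): LcrsNode
{
    $result = new LcrsNode($node->nodeName); // text nodes are named "#text"
    $previous = null;
    for ($child = $node->firstChild; $child !== null; $child = $child->nextSibling) {
        $converted = toLcrs($child);
        if ($previous === null) {
            $result->leftChild = $converted;     // first child hangs on the left
        } else {
            $previous->rightSibling = $converted; // later children chain to the right
        }
        $previous = $converted;
    }
    return $result;
}

// Pre-order walk that returns one indented line per node.
function preorder(?LcrsNode $node, int $depth = 0): array
{
    $lines = [];
    while ($node !== null) {
        $lines[] = str_repeat('  ', $depth) . $node->label;
        $lines = array_merge($lines, preorder($node->leftChild, $depth + 1));
        $node = $node->rightSibling;
    }
    return $lines;
}

$doc = new DOMDocument();
$doc->loadXML('<ul><li>one</li><li>two</li></ul>');
$tree = toLcrs($doc->documentElement);
echo implode(PHP_EOL, preorder($tree)), PHP_EOL;
// Prints:
// ul
//   li
//     #text
//   li
//     #text
```

One attraction of this shape is that removing or reordering a node only means rewiring one or two links, which suits a cleaner that prunes and rearranges subtrees.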

PHP Sessions, Put Into A Database. HTTP, the backbone protocol of the web, is stateless. This means the server has no memory from one client request to the next, and vice versa. PHP includes sessions to make up for this. Sessions provide a way of storing any data you want on the server's filesystem or, for the more adventurous, in a database. You can then reuse this data on any normally memoryless page you please.

It turns out that storing sessions in a DB is quite a bit more complicated than any websites I've seen will tell you (such as this one (2004), or this one here (2018), both excellent articles in themselves).

This is mostly because of PHP's internal session quirks. The reader will have to go down that rabbit hole on their own if they wish to, but essentially I've ensured safe, perhaps even overly paranoid, PHP sessions in the DB. This means any website using this CMS on a shared hosting plan can rest easy: session data lives in the site's own database instead of in session files on a server shared with other customers.
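The starting point for database-backed sessions is PHP's 'SessionHandlerInterface'. What follows is only the simple version that most tutorials show, without the locking and edge-case handling alluded to above, and it is not the CMS's code. The class name 'PdoSessionHandler' and the 'sessions' table layout are assumptions. The sketch targets PHP 8.0 or later with the pdo_sqlite extension, and uses an in-memory SQLite database purely so it can run anywhere.

```php
<?php
declare(strict_types=1);

final class PdoSessionHandler implements SessionHandlerInterface
{
    private PDO $pdo;

    public function __construct(PDO $pdo)
    {
        $this->pdo = $pdo;
    }

    public function open(string $path, string $name): bool
    {
        return true; // the PDO connection is already open
    }

    public function close(): bool
    {
        return true;
    }

    public function read(string $id): string|false
    {
        $stmt = $this->pdo->prepare('SELECT data FROM sessions WHERE id = :id');
        $stmt->execute([':id' => $id]);
        $data = $stmt->fetchColumn();
        return $data === false ? '' : (string) $data; // empty string means "new session"
    }

    public function write(string $id, string $data): bool
    {
        // REPLACE INTO is understood by SQLite and MySQL; other databases need their own upsert.
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, data, updated_at) VALUES (:id, :data, :now)'
        );
        return $stmt->execute([':id' => $id, ':data' => $data, ':now' => time()]);
    }

    public function destroy(string $id): bool
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE id = :id');
        return $stmt->execute([':id' => $id]);
    }

    public function gc(int $max_lifetime): int|false
    {
        // Delete sessions that have not been written to within the lifetime.
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < :cutoff');
        $stmt->execute([':cutoff' => time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

// Demonstration against an in-memory database, calling the handler directly.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE sessions (id TEXT PRIMARY KEY, data TEXT NOT NULL, updated_at INTEGER NOT NULL)');

$handler = new PdoSessionHandler($pdo);
$handler->write('abc123', 'user|s:5:"alice";');
echo $handler->read('abc123'), PHP_EOL;
```

In a real application the handler would be registered with 'session_set_save_handler($handler, true)' before 'session_start()', after which '$_SESSION' reads and writes go through the database.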

Cross-Site Request Forgery (CSRF). A user logs into a site. The site trusts that user (technically, the website's server trusts that user's browser). Suppose a completely separate ('cross') domain ('site') sends an unauthorized request through that user's browser. The current site thinks the request is coming from a trusted user when in fact it is not: that's a forged request.

The administrative backend issues tokens and invisibly sends one along each time the user does something that changes state. The server checks the token to confirm that the request is coming from this site itself. If the check fails, the backend saves all of that user's work (deleting nothing), then logs them out.
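A common way to implement such a token is sketched below. The function names 'issueCsrfToken' and 'isValidCsrfToken' and the 'csrf_token' key are assumptions, and a plain array stands in for '$_SESSION' so the example runs on its own. The save-and-log-out reaction described above is not part of the sketch.

```php
<?php
declare(strict_types=1);

// Create a fresh random token and remember it in the user's session.
function issueCsrfToken(array &$session): string
{
    $token = bin2hex(random_bytes(32)); // 64 hex characters from a secure random source
    $session['csrf_token'] = $token;
    return $token;
}

// Check a submitted token against the one stored in the session.
function isValidCsrfToken(array $session, ?string $submitted): bool
{
    $expected = $session['csrf_token'] ?? null;
    // hash_equals compares in constant time, which avoids leaking the token through timing.
    return is_string($expected) && is_string($submitted) && hash_equals($expected, $submitted);
}

$session = []; // stands in for $_SESSION
$token = issueCsrfToken($session);

// The token travels invisibly with every state-changing form.
$field = '<input type="hidden" name="csrf_token" value="'
    . htmlspecialchars($token, ENT_QUOTES, 'UTF-8') . '">';

var_dump(isValidCsrfToken($session, $token));   // bool(true): request came from our own form
var_dump(isValidCsrfToken($session, 'forged')); // bool(false): another site cannot know the token
```

The defence rests on the fact that a page on another domain can make the browser send a request, but cannot read the token out of this site's pages or session.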

Hack Me Ethically

I don't believe I've left any room for crackers to take down this site, besides the odd case such as 0-day vulnerabilities in PHP. However, if you manage to hack or crack my site, please be ethical about it! Send the vulnerability to admin at mgraichy.com