I'm writing a kind of CMS that lets users and admins create new content, such as news and blog posts, through a WYSIWYG editor, and I have a big doubt about how to handle the input data. Beyond validating each field, what should I do to secure the user input, store it safely and output it in a secure way?
What kind of protection does Fuel already provide, and what should I do to avoid any kind of hacking or security gap?
Fuel doesn't do much automatically; that is not our philosophy. We try to provide some sane defaults, but always want the developer to be in control, not the framework.
For example, most frameworks strip and filter on input (GET and POST variables). We don't, as it means you can never access the data exactly as it is posted. This is something you have to take into account, as it means that if a user posts something malicious, it will be stored in your database.
There are measures available to prevent that.
First is to make sure all your forms have CSRF protection. The Security and Form classes come with methods to help you with that. It will protect your forms from cross-site request forgery, where another site tricks a logged-in user's browser into submitting your form.
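As a rough sketch of the form side, the token can be added with Form::csrf(), which emits a hidden input carrying the current CSRF token. The view file name, the form action and the field names below are made up for illustration.

```php
<?php
// app/views/news/create.php (hypothetical view file)

echo \Form::open('news/create');        // 'news/create' is an assumed route
echo \Form::csrf();                     // hidden input carrying the CSRF token
echo \Form::textarea('body');           // the field the WYSIWYG editor attaches to
echo \Form::submit('submit', 'Save');
echo \Form::close();
```

On the receiving side the posted token is verified with Security::check_token() before the input is processed.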
Always validate all input, and use validation rules that are as tight as possible. If the input must be a single word, don't accept whitespace. If it is a positive number, don't just check for numeric.
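A minimal sketch of such tight rules with the Validation class could look like this. The field names 'slug' and 'priority' are invented for the example, and the snippet is assumed to run inside a controller action handling a POST.

```php
<?php
$val = \Validation::forge('news');

// A single word: letters, digits and dashes only, so whitespace is rejected.
$val->add('slug', 'Slug')
    ->add_rule('required')
    ->add_rule('valid_string', array('alpha', 'numeric', 'dashes'));

// A positive number: digits only AND at least 1, not merely "numeric".
$val->add('priority', 'Priority')
    ->add_rule('required')
    ->add_rule('valid_string', array('numeric'))
    ->add_rule('numeric_min', 1);

if ($val->run())
{
    // Only use the values that passed validation.
    $slug     = $val->validated('slug');
    $priority = $val->validated('priority');
}
else
{
    // Validation_Error objects, keyed by field name.
    $errors = $val->error();
}
```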
For fields that require HTML input (such as the result of a WYSIWYG editor) you can run Security::xss_clean() on it. Fuel uses the htmLawed library to do that, which can be tuned very precisely to the HTML you want to allow or to filter.
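In its simplest form, cleaning the editor output before storing it might look like the sketch below. The field name 'body' is an assumption.

```php
<?php
// Raw editor output, exactly as posted; Fuel does not filter it for you.
$raw_body = \Input::post('body', '');

// Run it through htmLawed with Fuel's default settings before storing it.
$clean_body = \Security::xss_clean($raw_body);
```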
This is very important, because unlike other frameworks, Fuel encodes ALL data sent to a View by default. This means that even if some HTML (or worse, JavaScript) gets through, it will be harmless: encoding makes sure it is echoed out as text instead of being interpreted by the browser.
Obviously, for HTML input you don't want that, so when passing this field to the View, you have to disable encoding (tell the View the data is safe). Only do this on data you are sure is clean!
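As an illustration, the difference between the default and the "safe" path could be sketched like this inside a controller action. The view name and the example values are invented; the View class's set_safe() method is what passes a value without encoding.

```php
<?php
// Example values; in practice $clean_body is HTML that already went
// through Security::xss_clean() before it was stored.
$title      = 'Release notes <script>alert(1)</script>';
$clean_body = '<p>Some <strong>formatted</strong> text.</p>';

$view = \View::forge('news/view');      // hypothetical view file

// Default: encoded on output, so the <script> tag is shown as text.
$view->set('title', $title);

// Explicitly marked as safe: passed as-is, so the HTML is rendered.
$view->set_safe('body', $clean_body);

return \Response::forge($view);
```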
Both the DB query builder and the ORM have measures to protect you from SQL injection. It doesn't matter if you just pass data to them, or use query binding.
It is strongly advised not to use DB::query() and hand-written SQL, but always use the query builder, to avoid SQL injection opportunities. So not:
DB::query("SELECT * FROM users WHERE username = '".$user."' and password = '".$pass."' LIMIT 1")->execute();
The hand-written query above is vulnerable to SQL injection, because the values are concatenated straight into the SQL string; the same query expressed through the query builder is not. If you need something in your query that is not supported by the query builder, use DB::expr(). Note that depending on the expression it might open a SQL injection opportunity, so be careful what you use it for.
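For comparison with the hand-written query, a query-builder version might look like the following sketch. $user and $pass stand for the same untrusted values as in the example above, and the DB::expr() line only shows the kind of fixed expression it is meant for.

```php
<?php
// Values are passed as data, so the query builder quotes and escapes them.
$result = \DB::select()
    ->from('users')
    ->where('username', '=', $user)
    ->where('password', '=', $pass)
    ->limit(1)
    ->execute();

// DB::expr() is inserted into the SQL unescaped, so keep it to fixed
// expressions and never build it from user input.
$count = \DB::select(\DB::expr('COUNT(*) AS total'))
    ->from('users')
    ->execute();
```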
When it comes to passwords, both SimpleAuth and OrmAuth use the PBKDF2 algorithm to hash passwords. A per-table salt is used, not a per-user one. Although there is a lot of debate about this, in general if a hacker has access to your database, (s)he not only has all password hashes, but also the per-user salts that belong to them. Since the code is open source, the way the password is hashed is known, rendering a per-user salt quite pointless.
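To make the idea concrete, here is a small sketch of PBKDF2 hashing with one application-wide salt. It uses PHP's built-in hash_pbkdf2() rather than Fuel's own Auth code, and the function names, the sha256 choice, the iteration count and the key length are all assumptions for illustration, not what SimpleAuth does internally.

```php
<?php
// Hash a password with PBKDF2 and a single application-wide salt.
function hash_password_sketch(string $password, string $app_salt, int $iterations = 10000): string
{
    // 32 raw bytes of derived key, base64-encoded for storage.
    return base64_encode(hash_pbkdf2('sha256', $password, $app_salt, $iterations, 32, true));
}

// Recompute the hash and compare it in constant time.
function verify_password_sketch(string $password, string $stored_hash, string $app_salt): bool
{
    return hash_equals($stored_hash, hash_password_sketch($password, $app_salt));
}
```

In a real application you would keep using the Auth package for this; the sketch only shows what "PBKDF2 with a shared salt" means.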
So the process I should follow in those cases is:
1. Set the CSRF token and check it on each POST request (I'm not sure whether it is possible or needed on a GET request).
2. Validate the data sent by the user as tightly as possible.
3. When using WYSIWYG editors, use xss_clean() on the fields that accept HTML.
3.1 Should I also use htmlentities() on these fields, or store the raw HTML after running xss_clean() on it?
4. To show the stored content, I could set encoding to false with no risk, since after validation and xss_clean() the data should be safe. Am I right?
5. I always use ORM models, or in some cases the DB query builder, with no custom SQL statements. Can I be confident that no additional checks or input verification are needed regarding SQL injection?
1. You can have it checked globally (there's a security config key for that), but the check happens so early in the loading of the framework that you can't properly capture a failure. You need to capture it if you want to give the user an error message, because usually it's not an attack, it's a user using the browser's back button.
We work around it by checking it in the before() method of our base controller, the controller that every other controller extends.
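A base controller along those lines could be sketched as follows. The class name, the flash message and the redirect target are assumptions; Security::check_token() does the actual token comparison.

```php
<?php
// app/classes/controller/base.php (hypothetical location and class name)
class Controller_Base extends \Controller
{
    public function before()
    {
        parent::before();

        // Only POST requests carry a form token in this sketch.
        if (\Input::method() === 'POST' and ! \Security::check_token())
        {
            // Usually a stale form (back button), so explain instead of failing hard.
            \Session::set_flash('error', 'The form has expired, please submit it again.');
            \Response::redirect(\Input::referrer('/'));
        }
    }
}
```

Every other controller then extends Controller_Base, and calls parent::before() if it defines its own before() method.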
2. Yes.
3. Yes.
Note that this isn't easy. The default settings of htmLawed work pretty well, but if you want really fine-grained checks, you'll have to tune it. You can see how complex it can get here: http://www.bioinformatics.org/phplabware/internal_utilities/htmLawed/htmLawed_README.htm
You can pass any options for htmLawed via the second argument of xss_clean().
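As an example of such tuning, the sketch below restricts the editor output to a small set of elements. The option names come from the htmLawed documentation; the particular element list and the field name are only an illustration, so check them against what your editor actually produces.

```php
<?php
// htmLawed options, passed through as the second argument of xss_clean().
$options = array(
    'safe'           => 1,                                      // htmLawed's built-in safe mode
    'elements'       => 'p, br, a, strong, em, ul, ol, li',     // allow only these elements
    'deny_attribute' => 'on*, style',                           // no event handlers, no inline styles
);

$clean_body = \Security::xss_clean(\Input::post('body', ''), $options);
```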
4. No, always leave encoding enabled!
When something gets encoded on its way to a View that should not have been, you notice quickly enough, since the page will for example print the HTML instead of rendering it. When you see that, check what the data is, make sure the HTML is intended and that the data is validated, cleaned and can be trusted, and only then pass that single variable to the View with encoding disabled, using the set_safe() method of the View class.
5. Correct.
The ORM is protected against SQL injection, so you don't need to worry about that. The same is true for the DB query builder. Models (both ORM and Model_Crud based) do auto-encoding, meaning you can pass them to a View, and when the view requests a property, it will be automatically encoded. This means that if your model contains HTML, you should pass those fields separately using set_safe(), as it's complex to disable model encoding for specific fields.
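In a controller action that could be sketched like this. Model_Post, its 'body' property and the view name are hypothetical; 'body' is assumed to hold HTML that was cleaned with xss_clean() before it was saved.

```php
<?php
$post = Model_Post::find($id);          // hypothetical ORM model, $id from the route

$view = \View::forge('post/view');      // hypothetical view file

// The model itself: its properties are encoded when the view reads them.
$view->set('post', $post);

// The one HTML field, passed separately and unencoded.
$view->set_safe('body', $post->body);

return \Response::forge($view);
```

In the view, use $body for the HTML content and $post->title and the like for everything else.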