Pedro Januário
The insane pursuit of challenges makes me feel alive, achieving goals makes me seek new ones, and sharing knowledge is a pleasant journey and a lifetime challenge. A software engineering career is full of challenges and stumbles, and learning from them is what makes you go further.
NGINX SSL performance improvements with proper configurations
Improving the performance of a web application is about much more than tuning your application code; server and database configurations are also among the best places to look.
A couple of weeks ago, when POODLE was discovered, we needed to change some of our server configurations. To take care of POODLE, disabling SSL v3 was enough, but since it was time for configuration changes anyway, I spent some time tuning our configs.
I had already noticed that our page load times were heavily affected by SSL, so the goal was:
Get rid of POODLE;
Improve page load times by tuning our SSL configurations.
I used the Pingdom tools to check page load analytics, but there are a couple of other decent tools for the job.
After digging for a while on the internet and reading about nginx SSL performance improvements, I found this awesome post with detailed information on a proper nginx security configuration.
Just by switching our preferred SSL ciphers and enabling the SSL session cache, page load times were reduced by about 40%.
For us this was a major quick win: in a couple of hours we made a performance improvement that was also noticed by our users.
Over the past two years I have been working a lot with ZeroMQ while building the ZeroMQ Service Suite (ZSS), and last Wednesday I gave a talk about ZeroMQ @require('lx').
The intent of the talk was to give an overview of ZeroMQ and how the audience could benefit from it. The slides are available here and on SlideShare.
Since the talk was short and didn't include any demos or hands-on exercises, we are planning a dojo to play around with ZeroMQ sockets.
Calculate ages: the 29 February issue, the leap day
Today I came across an error in one of our applications, and it was a funny one.
If you have code that returns an age based on a date field, take a look at it and test it with the date 29-02-1996: it should return 18 years for today. You should also test the corner cases, the day before and the day after the birthday, to ensure that it is properly implemented.
Why? Because it's a leap day!
http://en.wikipedia.org/wiki/Leap_year
Here is some Ruby code, with specs, for a proper age calculation.
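The original snippet is Ruby, but the rule is the same in any language. As a rough JavaScript sketch (the function name calculateAge is made up for illustration, and it assumes that in non-leap years someone born on 29 February only turns a year older on 1 March), the idea is to compare month and day instead of dividing a number of days by 365:

```javascript
// Hypothetical helper: returns the age in full years on the given day.
// Both arguments are Date objects; only the local year, month and day are used.
function calculateAge(birthDate, today) {
  var age = today.getFullYear() - birthDate.getFullYear();

  // The birthday has not happened yet this year if we are in an earlier month,
  // or in the same month but on an earlier day.
  var beforeBirthday =
    today.getMonth() < birthDate.getMonth() ||
    (today.getMonth() === birthDate.getMonth() &&
      today.getDate() < birthDate.getDate());

  return beforeBirthday ? age - 1 : age;
}

var birth = new Date(1996, 1, 29); // months are zero-based, so 1 is February
console.log(calculateAge(birth, new Date(2014, 1, 28))); // 17
console.log(calculateAge(birth, new Date(2014, 2, 1)));  // 18
console.log(calculateAge(birth, new Date(2016, 1, 29))); // 20
```

If your business rule says that a leap day birthday counts on 28 February in non-leap years, the comparison needs one extra case, so make sure the specs state which rule you expect.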
Some of my builds started to fail on Travis. Apparently the latest release, jasmine-reporters@2.0.0, is broken, and since the jasmine-node dependency is >=0.2.0, the latest version gets loaded.
So, to fix this until a new version arrives, you can install jasmine-reporters@0.4.1 as a dev dependency and npm will not load 2.0.0, since the already installed version meets the requirement.
While developing I hate to tie my applications to specific libs, and something that I am missing while developing in Node is a good logger facade. At least I didn't find any! :)
So with that in mind I developed a simple logger facade that can be used as a contract for async logging in applications.
Then we just need to attach plugins for our favourite log repositories and voilà, we are logging to multiple repositories asynchronously.
This can be used to log information to console/file while in development, production errors to Airbrake and production logs to a centralized repository on Elasticsearch.
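To make the idea concrete, here is a minimal sketch of what such a facade could look like. This is not the API of my package: createLogger, use, log and the write(entry, done) plugin contract are names invented for this example.

```javascript
// Hypothetical facade: createLogger, use and log are illustrative names,
// not the API of the package described in this post.
function createLogger() {
  var plugins = [];

  return {
    // Register a plugin; a plugin is any object with write(entry, done).
    use: function (plugin) {
      plugins.push(plugin);
      return this;
    },

    // Send one entry to every plugin and call back once all of them finished.
    // The callback receives the list of errors reported by the plugins.
    log: function (level, message, callback) {
      var entry = { level: level, message: message, timestamp: new Date() };
      var pending = plugins.length;
      var errors = [];

      if (pending === 0) {
        if (callback) { callback(errors); }
        return;
      }

      plugins.forEach(function (plugin) {
        plugin.write(entry, function (err) {
          if (err) { errors.push(err); }
          pending -= 1;
          if (pending === 0 && callback) { callback(errors); }
        });
      });
    }
  };
}

// Example plugin that writes to the console asynchronously.
var consolePlugin = {
  write: function (entry, done) {
    setImmediate(function () {
      console.log('[' + entry.level + '] ' + entry.message);
      done();
    });
  }
};

createLogger().use(consolePlugin).log('info', 'application started');
```

The application only talks to log, so swapping the console plugin for an Airbrake or Elasticsearch one does not touch the calling code.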
In the next weeks I will be implementing plugins for Airbrake and Elasticsearch.
Feel free to get in touch! ;)
UPDATE: The build is broken due to a change in the readme! :P Now seriously, it seems that jasmine-node has a bug that is causing the build to fail on Travis. I will take a look at that later! :D But it passes on my machine! :P
UPDATE 2: jasmine-reporters@2.0.0 is broken; installing version 0.4.1 as a dependency solves the issue.
Set up a Node.js project with CI on Travis and code coverage on codeclimate
While starting a new Node.js project I faced some difficulties doing what I wanted, so I prepared a skeleton project and a step-by-step guide that could help others and be reused on my next projects.
Requirements for project setup:
automatic builds
run tests
generate coverage reports
automated code review
Other requirements:
tests must run successfully before a commit goes through
coverage limits must be met before code can be pushed
Open-source projects can use some powerful tools for free, and I will use some of them; they also have paid accounts for private projects. Let's present them.
Tools
grunt: JavaScript task runner, to help automate your tasks;
jasmine-node: Package to use Jasmine, a BDD test framework, in Node.js;
istanbul: JavaScript coverage tool;
jshint: Detects errors and potential problems in JavaScript code;
travis: Continuous integration made easy;
codeclimate: Automated code review tool and coverage reporting;
gemnasium: Keeps your projects in shape and updated; this tool will help you manage your dependencies and track them;
Steps
1. create a GitHub repo with a readme and clone it (how to);
2. register or sign up at travis-ci.org;
3. link the GitHub repo on Travis (help);
4. register or sign up at codeclimate;
5. link the GitHub repo on codeclimate;
6. add the grunt file;
7. add .travis.yml;
8. encrypt your codeclimate token;
9. push to master;
10. the build should pass;
11. add some module and specs;
12. execute grunt test, check that the specs are passing and check the coverage report;
13. push to master and check the codeclimate information;
Steps 1 to 5 are straightforward, so I will skip them and go directly to our grunt tasks.
5. Set your tasks with grunt
Add a file named Gruntfile.js to the base directory of the project. You can get a sample file here, or get the final file from the project we are building.
JSHINT
- Install jshint contrib package with
npm install grunt-contrib-jshint --save
- Set the jshint configuration: file match patterns and jshint options;
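A jshint section in the Gruntfile could look roughly like the sketch below. The file patterns and the chosen options are assumptions for illustration, not necessarily what the skeleton project uses, and the wrapper function configureJshint is only there to keep the example self-contained.

```javascript
// Gruntfile fragment. The file patterns and the JSHint options below are
// illustrative assumptions, not the configuration of the skeleton project.
function configureJshint(grunt) {
  grunt.initConfig({
    jshint: {
      // Files to lint.
      all: ['Gruntfile.js', 'lib/**/*.js'],
      options: {
        node: true,    // allow Node.js globals such as require and module
        curly: true,   // always require curly braces around blocks
        eqeqeq: true,  // require === and !==
        undef: true,   // report use of undeclared variables
        unused: true   // report variables that are never used
      }
    }
  });

  // Makes the "jshint" task available to grunt.
  grunt.loadNpmTasks('grunt-contrib-jshint');
}

module.exports = configureJshint;
```

With this in place, running grunt jshint lints the matched files.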
In the jasmine configuration we are using spec files that follow the _spec.js pattern.
In the coverage options, the fail task option is set to true to force the task to fail when coverage thresholds aren't met. In this configuration we can see that we are aiming for 100% coverage on the four metrics.
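The check itself is simple. As a plain JavaScript sketch of the idea (this is not the plugin's code; findFailedMetrics and the sample summary are made up for the example), every metric is compared against its threshold and the task fails if any of them falls short:

```javascript
// The four Istanbul metrics, all aiming at 100% as described above.
var thresholds = { statements: 100, branches: 100, functions: 100, lines: 100 };

// Hypothetical helper: returns the names of the metrics whose percentage
// is below the configured limit.
function findFailedMetrics(summary, limits) {
  return Object.keys(limits).filter(function (metric) {
    return summary[metric] < limits[metric];
  });
}

// Sample coverage summary, invented for this example.
var summary = { statements: 100, branches: 87.5, functions: 100, lines: 98 };
var failed = findFailedMetrics(summary, thresholds);

if (failed.length > 0) {
  // prints "Coverage below threshold for: branches, lines"
  console.log('Coverage below threshold for: ' + failed.join(', '));
}
```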
This npm package uses Istanbul to instrument code and generate code coverage reports. This framework generates pretty decent reports, but for me they have one really big issue.
I have written about this before in this post. When we have a file with code that isn't required and has no specs, the coverage report is 100% because the framework doesn't reach this code.
To avoid this issue I have created a custom grunt task that makes all files be instrumented, so that coverage is correctly generated.
Add grunt test task
This task will run jshint, the custom instrumentation task, the env task and the tests with coverage.
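As a sketch, the alias task could be registered like this. The target name env:test is an assumption about how grunt-env is configured, and the wrapper function registerTestTask only exists to keep the example self-contained; check the final Gruntfile of the project for the real task list.

```javascript
// Registers a "test" alias that runs the pipeline in order. The task names
// come from this post; the "env:test" target name is an assumption.
function registerTestTask(grunt) {
  grunt.registerTask('test', [
    'jshint',                   // lint the code first
    'gen-instrumentation-file', // custom task that forces instrumentation of all files
    'env:test',                 // set the test environment variables
    'jasmine_node'              // run the specs with coverage
  ]);
}

module.exports = registerTestTask;
```

After that, grunt test runs the whole chain and stops at the first failing task.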
This is the base skeleton of our Gruntfile. I also added:
grunt-contrib-watch: useful for development;
jasmine_node:unit and :integration tasks for running unit and integration tests separately;
grunt-env: allows setting environment variables for different envs; we are using it in grunt to set the test env;
6. Set configuration file for Travis
Add a .travis.yml file to your base directory; feel free to copy my sample. This file contains an after_success hook that will push your coverage report to codeclimate. I will cover the codeclimate configuration in the next step.
7. Push coverage to code climate
After the build runs successfully we need to push the coverage report to codeclimate. To execute this step we need to install:
npm install codeclimate-test-reporter --save-dev
Now we need to set your codeclimate token in .travis.yml. To accomplish this, go to your codeclimate repository settings, get the token and encrypt it. This post shows you one way to do it, so I will not cover this part.
After this step you can push your code to GitHub and wait to check the build status. The next steps are just adding specs, running the tests and checking that coverage is executed and published, and I think they don't need any more explanation.
You have the entire code on GitHub, so feel free to fork it and play with it. The links to Travis, codeclimate and gemnasium are also available on the badges.
Istanbul code coverage force instrumentation of all files
The Istanbul code coverage tool is really nice, but it shows you the coverage of your specs instead of your code.
I want to be able to say that my code has 100% coverage, instead of saying that I have 100% coverage from my specs.
Istanbul instruments all files that are required by your specs and generates pretty decent code coverage reports.
When we have a file with code that isn't required and has no specs, the coverage report is 100% because the framework doesn't reach this code.
To avoid this issue I have created a custom grunt task that creates a spec file without tests but requiring all js files from the project, excluding specs. This makes all files be instrumented, so that coverage is correctly generated.
// Required at the top of the Gruntfile.
var fs = require('fs');
var fsTools = require('fs-tools');

// instrumentationFilePath is expected to be defined elsewhere in the Gruntfile.
grunt.registerTask('gen-instrumentation-file', function() {
  if (fs.existsSync(instrumentationFilePath)) {
    // remove file if exists
    fs.unlinkSync(instrumentationFilePath);
  }

  // use {'flags': 'a'} to append and {'flags': 'w'} to erase and write a new file
  var file = fs.createWriteStream(instrumentationFilePath, {'flags': 'a'});
  grunt.log.writeln('generating instrumentation file: %s', instrumentationFilePath);

  var srcPath = './';
  grunt.log.writeln("Source Path to walk: %s", srcPath);

  var specMatcher = grunt.config.data.jasmine_node.options.specNameMatcher;

  // writes a require line for every file that is not a dependency,
  // a coverage artifact, the Gruntfile or a spec
  var filecheck = function(path) {
    var isModule = path.indexOf('node_modules') === 0;
    var isCoverage = path.indexOf('coverage') === 0;
    var isGruntfile = path === 'Gruntfile.js';
    var isSpec = path.indexOf(specMatcher) !== -1;

    if (isModule || isCoverage || isGruntfile || isSpec) {
      return;
    }

    grunt.log.writeln("require file ./%s", path);
    file.write('require("./' + path + '");\n');
  };

  fsTools.walkSync(srcPath, '.js$', function(path, stats, callback) {
    filecheck(path);
  });

  grunt.log.ok('generated %s', instrumentationFilePath);
});
NOTE: This custom task depends on fs-tools npm package.
Now this task can be added to the test pipeline, before running your tests or coverage, to force instrumentation of all code.
Over the past years I have been working with two amazing search tools: Solr (3 years ago) and, in the last year, Elasticsearch.
So I would like to point to some interesting articles about the two, to help you choose your tool. I will also show you some other useful tools for Elasticsearch.
Both products are built on top of Lucene, an amazing tool that works with inverted indexes, and its search capabilities are amazing.
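If you have never seen an inverted index, the idea is to map each term to the list of documents that contain it, so a search becomes a lookup by term instead of a scan of every document. A toy sketch in JavaScript (nothing to do with Lucene's real implementation; buildInvertedIndex and the sample documents are made up for the example):

```javascript
// Toy inverted index: maps each lowercased term to the ids of the
// documents that contain it. Illustrative only, not how Lucene stores data.
function buildInvertedIndex(docs) {
  // Object.create(null) avoids clashes with inherited keys like "constructor".
  var index = Object.create(null);

  Object.keys(docs).forEach(function (id) {
    var terms = docs[id].toLowerCase().split(/\W+/).filter(Boolean);

    terms.forEach(function (term) {
      if (!index[term]) {
        index[term] = [];
      }
      // Record each document only once per term.
      if (index[term].indexOf(id) === -1) {
        index[term].push(id);
      }
    });
  });

  return index;
}

var index = buildInvertedIndex({
  d1: 'Solr is built on Lucene',
  d2: 'Elasticsearch is built on Lucene too'
});

console.log(index.lucene); // [ 'd1', 'd2' ]
console.log(index.solr);   // [ 'd1' ]
```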
So if you need to pick one of them, probably both will solve your needs and you don't need to worry too much. They are not just performant, they are incredibly performant, and useful for problems such as search, recommendation, real-time data, analytics and log analysis.
I will point to two great articles with some comparisons between them, just to give you an idea.
Stackoverflow
Elastic vs Solr feature smack down
My personal choice is Elasticsearch. Why?
It's more friendly;
It's really easy to integrate and you can choose any programming language;
It's nicer for distributed systems and very easy to scale;
Percolators are an awesome and useful feature;
It's surrounded by a set of amazing tools such as Logstash, Kibana and some other open source administration tools.
Recently I have been working on a recommendation system that is built using Elasticsearch, and that led me to explore some other features besides search, and it totally rocks!
It's really versatile, performant and easy to explore. The Elasticsearch query DSL will probably be your first headache when exploring more complex scenarios. The documentation is good but not very extensive about the DSL, and you will probably feel the need for more information, but as soon as you understand the concepts and how to use it, it becomes fairly easy.
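To give a taste of the DSL, a query is just a JSON document that nests clauses inside each other. The sketch below builds one as a plain JavaScript object. The field names (title, category, price) and the function buildProductQuery are invented for the example, and it uses the bool query with must and filter clauses as found in recent Elasticsearch releases, so older versions may need a different shape.

```javascript
// Builds a search body: full-text match on the title, restricted by an
// exact category and a maximum price. All field names are hypothetical.
function buildProductQuery(text, category, maxPrice) {
  return {
    query: {
      bool: {
        // Scored full-text clause.
        must: [
          { match: { title: text } }
        ],
        // Non-scored restrictions.
        filter: [
          { term: { category: category } },
          { range: { price: { lte: maxPrice } } }
        ]
      }
    },
    size: 10
  };
}

// The resulting object is what would be sent as the body of a search request.
console.log(JSON.stringify(buildProductQuery('running shoes', 'sport', 100), null, 2));
```

Once you see it as nested clauses, the more complex scenarios are mostly a matter of combining the same building blocks.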
Besides the main features of Elasticsearch you can also explore tools such as Logstash and Kibana. Logstash lets you search and analyse your logs in a really easy way, and with Kibana you can build awesome reports.
Just give it a try and explore it! I am pretty sure that you are going to like it!!
Some other useful tools and links for Elasticsearch:
BigDesk: Live charts and statistics;
Kopf: Web administration tool;
ElasticHQ: Monitoring, Administration and Query tool;
In the last couple of days I needed to analyse some performance issues in a Rails application that I was working on.
I am quite a newbie in the Ruby and Rails world, since I only started working with Rails more deeply (not as a playground :D ) in the last year. As has happened to me in the past, when I "switch" from one ecosystem to another I always look for tools similar to the good stuff I am used to in other ecosystems. It happened when I returned to .NET after working with Java for a couple of years, which by the way opened my eyes to a lot of stuff.
So when I am working in the .NET ecosystem I always use Glimpse or MiniProfiler; both are amazing tools and a requirement for .NET development. When I started looking for profiling tools for Ruby, I found that Sam Saffron had started a Ruby port of MiniProfiler and I gave it a try. The port is as awesome as I suspected. It details all database requests and their times, plus some extra client-side profiling.
Another useful tool is peek, which provides multiple views that show performance measures and counters, such as the number and execution time of queries made to pg, memcache, mongo and redis, among other kinds of views.
It shows you a bar like the one below.
The peek bar is really nice for a quick snapshot of the request, while MiniProfiler is awesome for discovering what is wrong.
We can even use MiniProfiler with Rails API: we can execute an API request, get the "X-MiniProfiler-Ids" response header and use it on the share handler, which retrieves a full page with all the information.
/mini-profiler-resources/results?id=header-id
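As a small sketch of that step, the helper below turns the header value into result URLs. The function name miniProfilerResultUrls and the base URL are made up, and the format of the header value is an assumption: the code accepts either a JSON array of ids or a comma-separated list.

```javascript
// Hypothetical helper: builds the MiniProfiler results URL for every id found
// in the X-MiniProfiler-Ids header. The header format is an assumption, so both
// a JSON array and a comma-separated list are accepted.
function miniProfilerResultUrls(baseUrl, headerValue) {
  var ids;

  try {
    ids = JSON.parse(headerValue);
  } catch (e) {
    ids = headerValue.split(',');
  }

  if (!Array.isArray(ids)) {
    ids = [ids];
  }

  return ids.map(function (id) {
    return baseUrl + '/mini-profiler-resources/results?id=' +
      encodeURIComponent(String(id).trim());
  });
}

console.log(miniProfilerResultUrls('http://localhost:3000', '["abc","def"]'));
// [ 'http://localhost:3000/mini-profiler-resources/results?id=abc',
//   'http://localhost:3000/mini-profiler-resources/results?id=def' ]
```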
Using the previous approach it would be fairly easy to read the headers from the server logs and collect data that could be used for statistics. I will probably try to do it with StatsD. Any tips for something like this? Give me some feedback.
Now, just go there and give it a try!
I think both tools are quite easy to configure, and that's why I will not put any information about it here.
Just a side note: you will see a huge number of queries that retrieve table schemas and the like, so don't freak out. It's the way ActiveRecord checks a lot of stuff. In production mode this is cached, and in development the queries aren't executed on the second hit.
In the software industry, iterative development is one of the major changes of the last decade. This article about the Inditex group was the main motivation to write about this success case.
The software industry learned from Toyota's success and "ported" the Lean/Kanban process into software development. It is one of the management methods widely used by software engineering teams and, for me, the best way to handle maintenance teams.
The Inditex group, founded by Amancio Ortega, is an amazing case study. They revolutionized the apparel industry using approaches similar to the ones used in the software industry. This change turned them into a case study at Harvard University.
Inditex moved away from the traditional way clothes are designed, produced and shipped, and this change put them several steps ahead of their competitors. Instead of creating two or four collections a year, they design new pieces every two weeks, produce small batches, ship them to the stores, analyze customer acceptance and readjust, and then the cycle restarts.
Inditex certainly has other factors that put it on top, but in my personal opinion this mindset shift and breaking the "business" rules is what makes them so strong.
I think we have a lot to learn from other industries, and with that we will probably find smarter and more profitable ways to solve or improve our workflow. A long time ago I had the pleasure of building software for a production line that builds car parts. It was amazing to see how the business and the employees benefited from it, and I certainly learned a lot.
Improving workflows is one of the things that I love about software engineering, and it's amazing how many things we can improve in several industries.
A long, long time ago I wrote a post about an exercise I did. Unfortunately, I haven't had "time" to blog as often as I would like to! :)
There are probably several ways to solve it, so here is my solution. I hope it helps others learn something, and feel free to give me some feedback or suggest any improvement.
Recently I came across a codebase that was completely bloated with API keys and other settings in code. This was weird because a settings file already exists and it's fairly easy to use it in code. But anyway, it was time to fix this.
While cleaning up this mess I found what was probably the cause. We are using settingslogic, and the default usage is to create a settings file in models; since the files in models usually aren't configured for autoload, the settings couldn't be used in initialisers.
For this reason I started to search for a solution and came across this awesome one.
I think it might be useful for more people, so here it is, and it's really easy.
Master thesis - Accessibility and Usability in information systems
During the next months I will be busy researching and writing my master's thesis, to finally complete my master's degree.
Last year I started another master's thesis, in data mining, and learned a lot about the telecommunications industry, but unfortunately, due to some difficulties accessing the real data, I wasn't able to accomplish my goal. For that reason, and to avoid issues related to external factors, I decided to change the subject and study another interesting topic that I never had the opportunity to look into deeply before.
As a Software Engineer I like the technical component in the same way that I appreciate and try to understand the business vision. So I see myself as a technical asset who can add value to the business chain; at least that is always my main goal.
Accessibility and usability in information systems are related, and my view is that they are a major success factor for any product of any kind. So for me this knowledge will be an important asset either as:
consultant, sharing this knowledge with the customer and providing important feedback to the business;
entrepreneur, my goal in the short/medium term is to follow my own track as an entrepreneur.
So stay tuned if you would like to hear about this subject.
Configuration mistakes or issues that cause obscure security holes in IT systems.
Usually system administrators have different privileges from database administrators, and that should be a golden rule for every IT system.
Yesterday I was trying to get access to a SQL Server installed on a random computer on which I have an administrator role. I didn't have any account credentials for SQL Server, either a SQL account or Windows Authentication. So I started to look for a way to get access to SQL Server back.
The SQL Server is configured with mixed authentication mode, and after a couple of minutes of googling I found one interesting article about it.
The article describes a nice way to get sysadmin access to a SQL Server installed on a machine where we have administrator privileges, without restarting SQL Server.
NOTE: This could be an awesome tool to save the day but can also be a huge security hole!!!