Nov 13, 2011

Ruby on Rails upgrade from 2.3 to 3.1

I recently upgraded the Ruby on Rails version we use in the Checkvist project from 2.3 to 3.1. I had to overcome tons of issues, and I share some of them here; maybe they will save someone time.

First, to upgrade or not to upgrade? This is a tough question. Upgrading a non-trivial project may take several days, and in the end you may get a system which performs worse than before. So you'd better have a good set of system tests, including performance tests.

In my case, I used Rails 2.3.11 with my own performance-related patch for running it on Ruby 1.9.2, and I aimed to migrate to Rails 3.1.1 (the latest version at the time of the migration). I really like the asset pipeline feature introduced in Rails 3.1, and I wanted to start using Ruby 1.9.3. After all, new features, plugins, and enhancements are developed against the latest version of RoR.

At the time of writing, Checkvist is running RoR 3.1.2, and I'm pretty comfortable with its performance.

And here is a list of notes I made during the upgrade (the more complicated your project is, the more unexpected the issues):

General upgrade notes

  • The upgrade from 2.3 to 3.1 actually consists of two parts: the upgrade from 2.3 to 3.0 and the upgrade from 3.0 to 3.1. Some things that were deprecated but still allowed in 3.0 no longer work in 3.1 (for instance, you must use the new routes syntax in your project).
  • One step you can take before "breaking everything" during the upgrade is to migrate your 2.3 project to Bundler. See the Bundler page for the instructions - it is not that painful. Do not forget to update your Capistrano deployment script to handle Bundler.
  • In my case, Checkvist was already compatible with Ruby 1.9.2, so I used this version of Ruby during the upgrade, and updated Ruby to 1.9.3 after that.
  • All mentions of RAILS_ROOT and RAILS_ENV should be replaced with Rails.root and Rails.env, respectively. The same applies to the logger: use Rails.logger instead of RAILS_DEFAULT_LOGGER.
  • The "uninitialized constant Syck::Syck" problem should be resolved by running gem update --system.
  • Rework all old-style ActionMailers to use the new syntax for sending mails - the old syntax doesn't work in Rails 3.0 anymore. Also, check the naming of your e-mail templates.
  • All files with extension .rhtml should be renamed to .html.erb
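
The constant replacements and the template rename from the list above are mechanical, so they can be scripted. Here is a minimal sketch in plain Ruby, not a complete migration tool: the LEGACY_CONSTANTS table and both helper names are made up for illustration, and a blind textual replace will also touch comments and strings, so the resulting diff still needs a review.

```ruby
# Hypothetical helpers for the mechanical parts of the 2.3 -> 3.x cleanup.

# Rails 2.x global constants and their Rails 3 replacements.
LEGACY_CONSTANTS = {
  'RAILS_DEFAULT_LOGGER' => 'Rails.logger',
  'RAILS_ROOT'           => 'Rails.root',
  'RAILS_ENV'            => 'Rails.env'
}.freeze

# Matches any of the legacy constants as a whole word.
LEGACY_PATTERN = /\b(?:#{LEGACY_CONSTANTS.keys.join('|')})\b/

# Returns the source text with the legacy constants replaced.
def modernize_constants(source)
  source.gsub(LEGACY_PATTERN) { |name| LEGACY_CONSTANTS.fetch(name) }
end

# Returns the Rails 3 name for an old-style template; other names pass through.
def modernize_template_name(path)
  path.sub(/\.rhtml\z/, '.html.erb')
end

puts modernize_constants('logger = RAILS_DEFAULT_LOGGER if RAILS_ENV == "production"')
puts modernize_template_name('app/views/lists/show.rhtml')
```

One thing a textual replace cannot fix: Rails.root is a Pathname rather than a String, so code like RAILS_ROOT + '/tmp' is better rewritten by hand as Rails.root.join('tmp').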

Upgrading to 3.0

    • For the upgrade steps, I followed the instructions from the Railscast, and used the Rails upgrade plugin.
    • Upgrading the routes.rb file may be non-trivial. Having tests for routes makes this process less painful. I don't recommend using an automatic converter - it is better to understand what's going on there and make your route rules cleaner.
    • Updating/rewriting the environment.rb file is rather simple: the old stuff goes mostly to application.rb and session_store.rb; non-typical MIME types are specified in mime_types.rb.
    • Gemfile: most gems have new, updated versions for Rails 3+, and I encountered no problems here. All the issues with the will_paginate or thinking-sphinx gems described in the early migration tutorials are already resolved.
    • After the update, I removed some obsolete plugins and initializers (and some monkey-patches for Rails 2.3 code). Maybe you'll have to review yours as well :)
    • I had to add the 'dynamic_form' gem to enable old-style helpers such as error_message_on.
    • Ruby files from the 'lib' subdirectory were invisible until I uncommented the "config.autoload_paths += %W(#{config.root}/lib)" line in application.rb.
    • I didn't have lots of link_to_remote and remote_form_for helpers (which disappeared in 3.0), so I updated the code explicitly, using my own AJAX helpers in the corresponding places.
    • Cucumber integration tests. This part was tricky, because I used Cucumber with the usual Rails integration tests (no Webrat, no Capybara) and needed POST and PUT requests, and support for such direct integration was broken in the newer version of Cucumber. However, you can use Cucumber directly with Rack::Test and get the application response from the last_response method of Rack::Test - this was sufficient to modify my tests accordingly.

    Upgrading to 3.1

    • The path for upgrading to 3.1 was described in this post, but in my case the switch to using assets was not simple.
    • I didn't use single application.js and application.css files for the whole application (because then, after each update, users would have to download all the .js and .css code again, with nothing served from the cache). Instead, I created several more specific files, per page or type of page. To accomplish this, the config.assets.precompile setting in environments/production.rb should list all the .js and .css files used by your application.
    • To deliver asset files in production, you have to change the Nginx configuration as described in this great guide.
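
    Listing every per-page bundle by hand in config.assets.precompile is easy to get wrong. Here is one way to build that list, sketched in plain Ruby so that it runs outside Rails; the bundle names are invented placeholders, not Checkvist's real files.

```ruby
# Hypothetical per-page bundles: each name stands for one top-level .js file
# and one top-level .css file under app/assets.
PAGE_BUNDLES = %w[lists checklist settings].freeze

# Returns the file names that the asset pipeline should precompile.
def precompile_entries(bundles)
  bundles.flat_map { |name| ["#{name}.js", "#{name}.css"] }
end

# In config/environments/production.rb the result would be appended like this:
#   config.assets.precompile += precompile_entries(PAGE_BUNDLES)
puts precompile_entries(PAGE_BUNDLES).inspect
```

    Keeping the bundle names in one place makes it harder to add a new page-specific file and forget to register it for production.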

    Performance notes

    • To get decent app performance, comparable with 2.3.x, use Ruby 1.9.3 and specify garbage collection parameters.
    • It is a bad idea to override an often-used attribute method in your model (for instance, to provide a default value). In Rails 3 such code works damn inefficiently - Rails doesn't create efficient accessors for such methods.
    • Do not use the changes method to find out which attributes changed - use changed_attributes for that.
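
    The difference between the two dirty-tracking calls is easier to see in a toy model. The class below is a plain-Ruby stand-in, not ActiveRecord: it only imitates the return shapes, where changed_attributes maps each changed attribute to its original value and changes maps it to an [old, new] pair. That changes allocates a fresh hash on every call is an assumption of this sketch about where the extra cost comes from.

```ruby
# Toy stand-in for a model with dirty tracking (not ActiveRecord).
class ToyRecord
  def initialize(attributes)
    @attributes = attributes.dup
    @changed_attributes = {}
  end

  # Writes a value, remembering the original one on the first change.
  def write(name, value)
    return if @attributes[name] == value
    @changed_attributes[name] = @attributes[name] unless @changed_attributes.key?(name)
    @attributes[name] = value
  end

  # Cheap: hands back the hash that is maintained anyway.
  def changed_attributes
    @changed_attributes
  end

  # Costlier: builds a new hash and a new pair per attribute on every call.
  def changes
    @changed_attributes.each_with_object({}) do |(name, old), result|
      result[name] = [old, @attributes[name]]
    end
  end
end

task = ToyRecord.new('status' => 'open', 'position' => 1)
task.write('status', 'closed')
puts task.changed_attributes.key?('status')
puts task.changes['status'].inspect
```

    If all you need is the answer to "did this attribute change?", the key lookup on changed_attributes is enough, and no [old, new] pairs have to be built.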

      So much for the upgrade stories. I'll be very glad if you find some of these notes useful in your riding on Rails.

      Jun 13, 2011

      Prototype 1.7 memory leak

      Lately, I've been trying to fix a memory leak in TeamCity. After a long investigation, I found out that DOM elements on the page remain in memory even after a simple construct like:
        element.on("click", Prototype.emptyFunction).stop();

      This code adds a fake event listener to an element and immediately removes it (all using the Prototype JavaScript library).

      I.e., after executing the code and removing the element from the page, the element remained in browser memory.
      (BTW, I used the Google Chrome memory profiler to find the lingering elements.)

      So, the leak was caused by Prototype's code, and I tracked it down to this nasty line:
          CACHE.push(element);

      Ironically, Prototype tries to avoid memory leaks in IE by removing all event handlers on page unload, and for that it keeps a collection of the elements in the CACHE array. But it never removes elements from the array (except on page unload), even when all event handlers have been removed from an element. For any heavy AJAX application, where elements are added/removed/replaced on the fly, this may be very unpleasant.

      I've fixed the problem in my Prototype fork on GitHub, and submitted a pull request to Prototype. I hope it will save someone some hair.

      You can also download patched prototype.js with the fix.

      Update: my original fix introduced a performance problem in the stopObserving method (when many elements were observed on the page). That problem is now fixed, and all the related resources above have been updated.

      Feb 13, 2011

      Checkvist downtime postmortem

      The Checkvist service was unavailable on Feb 13 from 04:13 UTC till 07:40 UTC.

      The total downtime was 3 hours 27 minutes. Users couldn't see or modify their information, but no data corruption occurred.

      We apologize to everyone who was unable to access their data during that time. We've already taken some measures to prevent such problems in the future; see more details below.

      What went wrong

      • On Saturday, Feb 12, we updated the software on the production server (ruby, passenger, nginx).
      • At 04:13 UTC the daily server maintenance procedure started. Some parts of the procedure were incompatible with the newly installed software, and access to Checkvist was broken.
      • E-mail and SMS notifications were sent to us. We didn't see the e-mail notification because we were away from our computers. The SMS notification was sent to an obsolete phone number :( So the problem went unnoticed until the morning, when we checked our e-mail.

      Problem resolution

      Measures to prevent similar problems in the future

      • The daily maintenance procedure was corrected.
      • We've set another time for the daily maintenance, so that it takes place in the daytime.
      • We've updated our monitoring service. The existing phone number for SMS notifications was corrected, and we also added another phone number for the notifications. We've also installed an iPhone app which can notify us about service failures.
      And finally, why all these explanations? I think it's better to be transparent and honest about the problems than to make people guess about the system's reliability.




      Jan 28, 2011

      Javascript error reporting in various browsers

      Again, about fighting JavaScript errors.

      I've just found out that the Opera browser in some situations provides the most detailed information about JavaScript errors. At least Opera 11 is rather good.

      Compare the error reporting in various browsers (all on Mac):

      Firefox 3.6.13 + Firebug 1.6.1 - didn't report the problem at all, neither in the Firebug console nor in the Firefox JavaScript errors window.

      Chrome 8.0.x gave this:

      [screenshot: Chrome error report]

      And the winner, Opera 11:

      [screenshot: Opera error report]

      See the difference?

      Nov 5, 2010

      Internet Explorer AJAX errors debugging in Prototype

      This is a short story I want to put down for those who face the same problem.

      In the Checkvist project, there is some fairly non-trivial AJAX code. We use the Prototype JavaScript library for AJAX handling. Prototype allows you to specify an error dispatcher for JavaScript errors in AJAX code, and as a fallback solution, I set window.alert() to report errors.

      Several days ago I got a bug report related to the list deletion operation in Checkvist - Internet Explorer showed a couple of alert error dialogs. The problem was intermittent on the production server, and I couldn't reproduce it locally at all.

      The really nice and cool developer tools in IE8 couldn't help me. I had an exception object in my error handler, but there was no way to obtain a stack trace from it and find out where the real problem occurred.

      I tried to rethrow the caught exception so IE would show me what happened - no luck.

      After plenty of digging in Google I found a hint. IE can debug an error that occurred in JavaScript code only if the code wasn't wrapped in a try/catch block. So if you handle problems yourself, there is no way to catch the real problem in the Internet Explorer debugger.

      To solve the problem I had to remove all try/catch blocks from Prototype's AJAX code, deploy it to the staging server, spend another 10 minutes reproducing the problem, and - voila! - I got an invitation dialog from IE8 to debug the uncaught error.

      That was it. The error was fixed, Prototype was reverted to its original state, and the code was deployed to production.

      If you know another solution for the problem - please comment, I'd be glad to know.


      May 16, 2010

      Feature discoverability

      One of the great approaches to developing usable software is to actually use the product you're working on. It is often called "eating your own dog food". The benefits are obvious - you have a clear source of requirements and priorities, because you're your own customer. A lot of good books for entrepreneurs, like "The Art of the Start" or "Rework", promote this approach.

      But there is a trap.

      While adding features you mostly think of their usefulness, speed, productivity. You ponder how you'd use them in the most efficient way. And this sounds good, right?

      In this situation you're thinking as an expert user of your product. As a user who knows the product perfectly well from its day one.

      Features designed for experts are usually hidden. Keyboard shortcuts, advanced interface settings, non-trivial mouse gestures, cool rich functions hidden deeply in the UI. And you, as an expert of your product, might not expose these features enough (especially if you care for UI simplicity and don't want the interface to be bloated).

      And even when you're adding a non-expert feature, you can still hide it somewhere in the interface and your users won't find it. You definitely will - you know where it is. But other people can be unaware of its existence at all, or (worse) miss it even after searching for the feature.

      The most relevant recipe to detect such problems early is to perform regular usability testing, especially for the new functionality. Unfortunately, most software development teams don't do this, though even a simple 5 second test may help.

      Here are some other steps to ensure feature discoverability:
      • Provide a logical, predictable navigation system in the application, either in the menu or some other way. Many users use the menu as a reference for the application's features.
      • Give users a way to search for a particular function, either in the reference documentation or as an action search within the UI.
      • Make a screencast and/or blog post to inform your existing users about a cool new feature.
      • In the application, create a "Fresh updates" section, which will inform existing users about new versions of the application and which new features are available in them.
      • Create a "Hint area" in the application, which shows some relevant actions for the current situation.
      • If your application performs some lengthy operations, you may show users a "Tip of the day" screen while such operations are in progress.
      • Prepare a printable cheat sheet with the application's keyboard shortcuts, if you provide them.
      We're facing discoverability issues both in TeamCity and in Checkvist, and we're going to deal with them in the near future, so your feedback is appreciated.
      How do you help your users to find new or advanced features in your application?

      Apr 21, 2010

      What I like about TeamCity 5.1

      Hello,

      I've decided to sum up the most interesting (from my personal point of view) new stuff in TeamCity 5.1 release. I started writing this post in text, but decided that a form of outline is more suitable for that. So here is the outline I've prepared using Checkvist:




      So please, grab it and use it! TeamCity Professional Edition is free. TeamCity Enterprise for open-source projects is also free.

      And there is a 60-day evaluation license for TeamCity Enterprise (you can use this license to upgrade your existing TeamCity Pro installation).

      Enjoy your builds :)

      Mar 12, 2010

      Memory leak fix in script.aculo.us Autocompleter

      The latest released version of script.aculo.us (1.8.3) has a really old memory leak. In short, Autocompleter creates a lot of event handlers and never removes them. Given that Checkvist will use tag autocompletion rather intensively, I've decided to fix this problem.

      My solution is attached to the issue in Lighthouse and is also available in my fork of script.aculo.us.

      Maybe this fix will be helpful to someone else :)

      Mar 8, 2010

      Calculating the cursor position in textarea with JavaScript

      I've been spending some time writing tag support in Checkvist, and decided to share a bit of related JavaScript code.

      The idea is to allow adding tags with smart syntax: when you write "Call Bob regarding new furniture tomorrow #home", Checkvist will create a task "Call Bob regarding new furniture" due tomorrow and with the tag #home.

      An additional nicety could be tag completion after the '#' character. But here comes a problem - how to show the completion popup near the cursor position in the textarea? The typical approach used in del.icio.us or Gmail is to show the popup under the text field, but it doesn't look good when your cursor is far from the popup, at the beginning of a large text area.

      After some googling I didn't find an existing solution. The only helpful code was for detecting the text cursor position relative to the beginning of the string.

      To convert this position into pixel coordinates, one has to find out how the text is laid out within the textarea: where the line breaks are, how many lines there are in the text, and so on. This, in turn, depends on the font metrics of the text and requires an answer to the question "What is the length of the given string if it is placed into this textarea?".

      The answer to the last question can be obtained if you create a div with the same font metrics as the original textarea, give it absolute positioning, put a string into it, and take its width. Using such a function, you can model text wrapping in the textarea, and find out the actual X,Y coordinates of the cursor in the textarea.
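
      The wrapping model itself does not depend on the browser, so here is a minimal sketch of it in plain Ruby; in a real page, the measure callable would be the hidden-div measurement described above. The function name, the fixed-width stand-in for measure, and the greedy word-wrap rule are assumptions of this sketch and not the actual implementation. It also ignores padding, scrolling, and words longer than a whole line.

```ruby
# Stand-in for the hidden-div measurement: every character is 10px wide.
FIXED_WIDTH = ->(string) { string.length * 10 }

# Returns [x, y] pixel offsets of the cursor relative to the text origin.
#   text        - full contents of the text box
#   cursor      - cursor position as a character offset into text
#   box_width   - available line width in pixels
#   line_height - height of one visual row in pixels
#   measure     - callable returning the pixel width of a string
def cursor_xy(text, cursor, box_width, line_height, measure)
  row = 0
  line = ''    # text on the current visual row, up to the cursor
  offset = 0   # character offset of the current token
  # Tokens: a newline, a run of other whitespace, or a word.
  text.scan(/\n|[^\S\n]+|\S+/) do |token|
    break if offset >= cursor
    if token == "\n"
      row += 1
      line = ''
    else
      is_word = !token.strip.empty?
      # A word that does not fit on the current row moves to the next row.
      if is_word && !line.empty? && measure.call(line + token) > box_width
        row += 1
        line = ''
      end
      # Keep only the part of the token that precedes the cursor.
      line += token[0, cursor - offset]
    end
    offset += token.length
  end
  [measure.call(line), row * line_height]
end

p cursor_xy('call bob tomorrow', 12, 100, 16, FIXED_WIDTH)
```

      Note that the wrap decision uses the whole word, not just the part before the cursor: a word that is still being typed can jump to the next row, and the cursor moves with it.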

      You can find my implementation of this approach on GitHub (BTW, a really great place to share open-source code). This library was tested with FF3, Chrome, IE, Opera, and Safari. There may be some glitches, but they are rare and I'm pretty comfortable with the results so far. The code does not need Prototype, jQuery, or any other JavaScript library.

      I hope this code will be useful to someone else, and if so, please drop me a line :)

      Aug 22, 2009

      Checkvist Pro plan

      After a long silence, we've finally released the Checkvist Pro plan. The detailed post about this is available.

      With this release, I've also upgraded Rails to version 2.3.3 and migrated the MySQL database from the MyISAM to the InnoDB engine (because we had problems with data consistency when transactions were interrupted).

      So far so good, performance hasn't become worse. And I'm pretty satisfied with the upgrade of the app engine from a mongrel cluster to Nginx Passenger.

      Now I'll focus on the daily work and on adding more features to Checkvist.

      Apr 12, 2009

      Migration of Checkvist to Rails 2.3

      I decided to migrate Checkvist to the new and fresh Rails 2.3 (or 2.3.2, to be more specific).

      I cannot say I really need the features from 2.3, but I think it's worth using the latest release (especially given that I already had to patch my Rails 2.2.2 installation to remove some bugs from it).

      As usual, the migration turned out to be an adventure:

      • Changed all tests which extended Test::Unit::TestCase and used fixtures to extend ActiveSupport::TestCase instead
      • Many test failures were fixed by adding self.use_transactional_fixtures = false to test_helper.rb
      • Fixed tests with cookies (see a good post with the explanation)
      • Faced a bug with cookie escaping in tests. This bug is not fixed in 2.3.2, so some tests are failing unless edge 2.3.x is used.
      • DEPRECATION: formatted_ paths had to be replaced. Replaced.
      • DEPRECATION: session.delete => session.clear
      • DEPRECATION: session.session_id => request.session_options[:id] (faced another already fixed bug)

      Some of these issues are described in the Rails 2.3 changelog. And some (like bugs ;) are not.

      So now, tests are successful again (for the latest 2.3 branch). But I'm not going to update production yet, will do some more testing. And, probably, wait for 2.3.3.

      Mar 18, 2009

      Monit start/stop problem from the command line

      Monit is a great tool to monitor various UNIX-like services and to take appropriate actions when they fail. I've been using it with great success to monitor the Checkvist server. But, as with any tool, there are some issues.

      I spent quite some time trying to figure out why monit's command line actions did not work. None of the start, stop, or restart actions did anything. Monit's log didn't contain any trace of my attempts to change the state of services.

      After some browsing and googling and reading docs I found the cause of the problem: I had enabled HTTP access to monit only in read-only mode, like this:
      set httpd port 3111
      allow user:user read-only

      And that was the problem. For command line commands to work, the first 'allow' line must not contain the 'read-only' option!
      This is written in the Monit docs, but not in the FAQ:

      NB! a Monit client will use the first username:password pair in an allow list and you should not define the first user as a read-only user. If you do, Monit console commands will not work.
      ...
      If the Monit command line interface is being used, at least one cleartext password is necessary. Otherwise, the Monit command line interface will not be able to connect to the Monit daemon server.

      So the solution was to change HTTP settings to:
      set httpd port 3111
      allow user123:user321 # in reality, really cryptic credentials go here; I don't use them anyway
      allow user:pwd read-only


      I hope this will save some time for those having the same problem.

      Mar 13, 2009

      JetBrains gone twitting

      Recently I've created a Twitter account with the basic purpose of providing some support for JetBrains TeamCity and Checkvist.

      Many JetBrainers actively tweet and provide product support as well.

      Today, you may find a Twitter account for most JetBrains products:
      These accounts are pretty new, but in the long run you'll find a lot of interesting stuff in these feeds, I'm sure.
      So follow to stay tuned!

      Update: dotTrace account has been created as well :)

      Jan 9, 2009

      Standalone Windows Mobile emulator setup

      I've spent several hours trying to set up and run the Windows Mobile emulator (we're working on a mobile UI for Checkvist), and here is the essence of my experience:
      • Don't try to run this emulator under Parallels / Mac OS if you need a working network connection in the emulator. The emulator uses Virtual PC functionality to set up networking, and that is not compatible with Parallels networking. In fact, I spent most of the time trying to overcome this problem, and failed :(
      • Make sure your Windows installation has .NET Framework 2.0 installed (this may not be the case for Win XP)
      • As mentioned above, you also need Virtual PC 2007
      • Install the standalone Windows Mobile Emulator
      • Install, and then reinstall a second time (repair), the Windows Mobile Emulator Images. I used the 6.0 ones, but there is a 6.1 version as well.
      • Use these instructions to set up networking in the emulator.



      nginx, Analog stats, LOGFORMAT

      Currently, I'm using the nginx web server for the Checkvist project, and I have the following definition for the access log format (from The Rails Way book):
      log_format  main  '$remote_addr - $remote_user [$time_local] $request '
                        '"$status" $body_bytes_sent "$http_referer" '
                        '"$http_user_agent" "$http_x_forwarded_for"';

      Today I decided to create a quick and dirty statistics report for checkvist.com using Analog. The problem was that Analog does not recognize the log format used by nginx by default - one has to tweak the LOGFORMAT option.

      If you are in the same situation, here is the LOGFORMAT I use:
      LOGFORMAT (%S - %u [%d/%M/%Y:%h:%n:%j %j] %j %r %j "%c" %b "%f" "%B" %j)
      Maybe this will help someone :).
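
      To check that a log format definition like the one above really splits lines the way you expect, it helps to parse a line by hand. Below is a minimal Ruby sketch of a parser for the nginx log_format shown in this post, with the fields in the same order that the LOGFORMAT line has to match. The sample line, the constant names, and the function name are invented for illustration.

```ruby
# Made-up access log line in the nginx format defined above
# (note the unquoted request and the quoted status code).
SAMPLE_LINE = '203.0.113.7 - - [09/Jan/2009:10:15:32 +0000] ' \
              'GET /checklists/1 HTTP/1.1 "200" 5120 "-" "Mozilla/5.0" "-"'

# One named group per field, in the order the fields appear in the log.
NGINX_LINE = /\A
  (?<remote_addr>\S+)\s-\s(?<remote_user>\S+)\s
  \[(?<time_local>[^\]]+)\]\s
  (?<method>\S+)\s(?<path>\S+)\s(?<protocol>\S+)\s
  "(?<status>\d{3})"\s
  (?<bytes>\d+)\s
  "(?<referer>[^"]*)"\s
  "(?<user_agent>[^"]*)"\s
  "(?<forwarded_for>[^"]*)"
\z/x

# Returns a hash of field name => value, or nil if the line does not match.
def parse_access_line(line)
  match = NGINX_LINE.match(line.chomp)
  return nil unless match
  Hash[match.names.zip(match.captures)]
end

fields = parse_access_line(SAMPLE_LINE)
puts fields['path']
puts fields['status']
```

      A line that comes back as nil here is a good hint about which part of the format string needs another look.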