Panasonic Youth rob sanheim writes about software, business, ruby, music, stuff and things



MacPorts Ruby performance gotcha

If you install Ruby from MacPorts (all the cool kids do), be aware that the first 1.8.7 port had some issues that basically broke performance, making it run three times slower than normal. Verify that you have the 1.8.7-p72_2 version active, and not 1.8.7-p72_1:

~ $ sudo port installed |grep ruby
ruby @1.8.7-p72_1+thread_hooks
ruby @1.8.7-p72_2+thread_hooks (active)

The defect on MacPorts Trac has more details, along with links to a relevant ruby-talk thread.

It’s worth pointing out that this issue went from discovery to being fixed and released into the port stream in about a day’s time. MacPorts has really come a long way from just a year or two ago. Fairly major packages used to lag behind the latest stable releases by multiple versions, but now I find everything I care about stays up to date with ease from MacPorts. They are even able to keep up with git releases, an impressive task considering how fast and furious git releases come.

Kudos to the MacPorts team for doing a fine job and making developers’ lives easier.


Tarantula 0.0.5 Released - the “Naked Aardvark” release

Announcing version 0.0.5 of Tarantula.

Tarantula is a big fuzzy spider. It crawls your Rails application, fuzzing data to see what breaks. It can verify HTML validation across all your pages, ensure you don’t have 404s, and pretty much anything else you want via custom handlers.

Don’t let the version number fool you: we’ve been using Tarantula across many projects at Relevance and it’s very stable. This release fixes a number of annoying bugs, including namespace conflicts with other classes due to Rails dependency loading. It also improves the gem spec with correct dependencies and cleans up the HTML reporter.

Install it via GitHub:

gem install relevance-tarantula --source http://gems.github.com

or via Rails 2.1+ gem handling:

config.gem "relevance-tarantula", :source => "http://gems.github.com"


Scp or rsync failing with no error message? Check your startup scripts…

The other day I was having issues trying to scp/rsync data, with no real error message to help debug things. Turns out that any output produced by your startup scripts will break rsync/scp hard. I had some simple ‘echo’ statements that printed when different scripts were being loaded…turns out scp/rsync don’t like that.

My capistrano task was a very simple call out to the ‘get’ helper, which just uses scp under the hood. The task ran and looked as if it completed, only nothing was ever transferred and the scp progress bar never came up. Sometimes it would block and do nothing, which was real fun, too.

The solution was simple - change all the bash scripts we use so they don’t echo anything when running. I deployed the new scripts to all the servers I needed to scp with, and the issue was resolved.

Since this is a known issue in the faq, it won’t be fixed or improved with a better error message. It’s just something you need to be aware of and work around, either by detecting whether the session has an interactive terminal before sending output or by removing the output statements from your startup scripts altogether.
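One way to check whether a server is affected is to run a command that should print nothing and see if anything comes back. Here is a rough Ruby sketch of that check. The helper names are made up for illustration, and the ssh invocation in the comment uses a placeholder host, so treat it as a starting point rather than a finished tool.

```ruby
require "open3"

# Runs a command and returns whatever it wrote to stdout.
# For a real check you would pass something like
#   ["ssh", "example-host", "/bin/true"]
# (example-host is a placeholder). /bin/true prints nothing itself, so any
# output has to come from the startup scripts on the remote side.
def stray_output(command)
  stdout, _stderr, _status = Open3.capture3(*command)
  stdout
end

# True when the command produced no stdout at all.
def clean_shell?(command)
  stray_output(command).empty?
end

# Local stand-ins so the sketch runs without a remote host.
puts clean_shell?(["true"])
puts clean_shell?(["sh", "-c", "echo loading profile"])
```

If the check reports stray output for a remote host, the startup scripts there are the first place to look.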


Git Clone vs cp -R --> WTF?

I knew git was fast, and I even knew it was faster than a lot of plain linux local file operations. Still, this blew me away:

CODE:
  rsanheim@ares:~/src/personal/oss $ du -hd 0 insoshi/
   26M    insoshi/

  rsanheim@ares:~/src/personal/oss $ time git clone insoshi/ /tmp/insoshi
  Initialize /tmp/insoshi/.git
  Initialized empty Git repository in /private/tmp/insoshi/.git/
  Checking out files: 100% (2193/2193), done.

  real    0m3.826s
  user    0m0.251s
  sys     0m0.658s

  rsanheim@ares:~/src/personal/oss $ time cp -R insoshi/ /tmp/insoshi_cp

  real    0m9.065s
  user    0m0.114s
  sys     0m1.442s

Ok, so a 26 meg repo takes more than twice as long to copy via a recursive cp as via a local git clone. That's a fairly small repo, so let's try something bigger:

CODE:
  rsanheim@ares:~/src/relevance $ du -hd 0 rails
   75M    rails

  rsanheim@ares:~/src/relevance $ time git clone rails /tmp/rails2
  Initialize /tmp/rails2/.git
  Initialized empty Git repository in /private/tmp/rails2/.git/

  real    0m2.321s
  user    0m0.151s
  sys     0m0.465s

  rsanheim@ares:~/src/relevance $ time cp -R rails/ /tmp/rails

  real    0m7.133s
  user    0m0.067s
  sys     0m1.505s

At 75 megs, the rails repo still clones ~3 times faster than it copies.

Obviously, this is not scientific at all, but the point is pretty clear. Git is doing some magic that lets it move files around locally 2 to 3 times faster than a plain copy. From looking at the man page, I would guess it has something to do with git using hardlinks for things in .git/objects when cloning locally. My linux fu falls down a bit here -- what are the ramifications of using hard links versus doing a "real" copy?
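On the hard link question: as far as I understand it, a hard link is just a second directory entry pointing at the same inode. Nothing is copied, both names have to live on the same filesystem, and an in-place edit through one name shows up through the other. That is presumably fine for .git/objects because git is not expected to modify an object file once it has been written. Here is a small Ruby sketch to poke at that behaviour in a scratch directory. The file names are arbitrary and it says nothing about what git does internally.

```ruby
require "tmpdir"

# Creates a file plus a hard link to it in a scratch directory and reports
# what the filesystem says about the pair.
def hard_link_demo
  Dir.mktmpdir do |dir|
    original = File.join(dir, "object")
    linked   = File.join(dir, "object_link")

    File.write(original, "blob data")
    File.link(original, linked) # second name for the same data, nothing copied

    same_inode = File.stat(original).ino == File.stat(linked).ino
    link_count = File.stat(original).nlink

    # An in-place edit through one name is visible through the other.
    File.open(linked, "a") { |f| f.write(" (edited)") }
    seen_via_original = File.read(original)

    # Deleting one name leaves the data reachable through the remaining one.
    File.delete(original)
    survives = File.read(linked)

    { same_inode: same_inode, link_count: link_count,
      seen_via_original: seen_via_original, survives: survives }
  end
end

p hard_link_demo
```

The practical ramification is the shared-edit part: a "real" copy gives you independent data, while hard-linked names are the same data, so they are only safe for files nobody rewrites in place.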

(This also makes me want to try out gitbak even more...)


Quick: Find the Bug or Gotcha with named_scope

Think fast! Where's the bug?

RUBY:
  named_scope :active, :conditions => ["activated_at <= ?", DateTime.now.utc.to_s(:db)]

Looks fine, right? Maybe you've hit this already, and you see it immediately.

The symptom is that DateTime.now always seems to be a bit off - maybe you just restarted your server and it's only a few minutes off.

The bug is that DateTime.now gets evaluated at the time the class is loaded, not when the finder is run. What makes this easy to miss is that it will always work fine in tests and development, as everything is constantly getting reloaded there.

The fix, obvious once you've spent a combined time of over an hour trying to figure out what is going on:

RUBY:
  named_scope :active, lambda { { :conditions => ["activated_at <= ?", DateTime.now.utc.to_s(:db)] } }
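The same load-time versus call-time difference can be seen without Rails at all. The sketch below uses two made-up classes, EagerScope and LazyScope, as plain-Ruby stand-ins for the two named_scope forms. It is only an illustration of when Time.now gets evaluated, not how named_scope is implemented.

```ruby
# Plain-Ruby stand-ins for the two named_scope forms above; no Rails needed.
class EagerScope
  # The array, including Time.now, is built once while the class body loads.
  CONDITIONS = ["activated_at <= ?", Time.now.utc]

  def self.conditions
    CONDITIONS
  end
end

class LazyScope
  # The lambda body runs again on every call, like the fixed named_scope.
  CONDITIONS = lambda { ["activated_at <= ?", Time.now.utc] }

  def self.conditions
    CONDITIONS.call
  end
end

# Reads the timestamp from a scope twice, with a short pause in between.
def timestamps_after_pause(scope, pause = 0.05)
  first = scope.conditions.last
  sleep pause
  [first, scope.conditions.last]
end

eager_first, eager_second = timestamps_after_pause(EagerScope)
lazy_first, lazy_second = timestamps_after_pause(LazyScope)
puts "eager timestamps identical: #{eager_first == eager_second}"
puts "lazy timestamp advanced:    #{lazy_second > lazy_first}"
```

In a long-running production process the eager version keeps handing out the timestamp from boot, which is exactly the "a bit off" symptom.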


Notes on testing Bj (Background Job)

Some thoughts and random notes on testing Bj within a Rails integration test (or spec).

  • You have to turn transactions off for the scope of the test, or suffer very confusing issues, since Bj itself wraps the job submittal within a transaction. The way I did this was just overriding the use_transactional_fixtures method in the one specific spec.

    RUBY:
    describe Foo do
      # disable transactional fixtures for this spec only
      def self.use_transactional_fixtures
        false
      end

      # ... examples ...
    end

  • Remember, bj = background job. This may seem obvious, but whatever you submit to bj will be running in an entirely different process, so in your spec you need to wait for that job to complete before trying to assert things. You can do something as simple as this:
    RUBY:
    MAX_TIME = 10.0  # seconds to wait before giving up
    seconds = 0.0
    # poll every half second until the job leaves the pending state
    while job.pending?
      job.reload
      seconds += 0.5
      sleep 0.5
      raise "job still pending after #{MAX_TIME} seconds" if seconds > MAX_TIME
    end
    # normal assertions here

    This gives your job up to 10 seconds to finish, and will timeout if it takes too long, which usually means something has gone wrong.

  • You now have to watch multiple logs to figure out what is going on. So tail your test.log and tail the bj log as well, and run the script in isolation to make sure you understand where exceptions and syntax errors will go. I wasted some time scanning logs when I really needed to check the job.stderr field that bj populates, so be sure to output that for common test failures.
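If you end up waiting on jobs in more than one spec, the polling loop from the second note could be pulled into a small helper. This is only a sketch: wait_until is a hypothetical helper of my own, not something Bj provides, and FakeJob is a stand-in so the example runs without Rails or Bj.

```ruby
# Hypothetical helper (not part of Bj): polls the block until it returns
# true, raising if the timeout passes first.
def wait_until(timeout = 10.0, interval = 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  until yield
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "condition not met within #{timeout} seconds"
    end
    sleep interval
  end
  true
end

# Stand-in for a Bj job: reports pending for a fixed number of reloads.
class FakeJob
  def initialize(reloads_until_done)
    @remaining = reloads_until_done
  end

  def reload
    @remaining -= 1 if @remaining > 0
    self
  end

  def pending?
    @remaining > 0
  end
end

job = FakeJob.new(3)
wait_until(2.0, 0.01) { !job.reload.pending? }
puts "job finished, run normal assertions here"
```

Using a monotonic clock for the deadline means the timeout reflects actual elapsed time rather than the sum of the sleeps.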

Overall, I've been pleased with bj, besides some open questions I've still been working out by perusing the source. Check it out if you need an easy-to-use persistent job queue.


CapGun and LogBuddy updated to 0.0.5

Some long overdue releases of cap_gun and log_buddy - both have been updated to version 0.0.5. Both are now available as gems on github.com/relevance as well as from rubyforge.

CapGun gives you super simple deployment notifications from Capistrano. LogBuddy gives you a log helper on all objects, and can also log the name of the thing passed in along with its value -- saving you typing and making debugging quicker.

CapGun got a fix so it does not attempt to display the rails_env if it's not defined - this should clean up any strangeness in notifications if you saw something like "my_app was deployed to ".

LogBuddy got some minor tweaks and improved specs.

Both libraries now use Echoe, since Hoe complains about readme.txt when I want to use readme.rdoc, dammit. Both now only have a dev dependency on echoe to play nice with RubyGems 1.2.

You can install them via github or rubyforge:

sudo gem install log_buddy
sudo gem install cap_gun

or

gem sources -a http://gems.github.com
sudo gem install relevance-log_buddy
sudo gem install relevance-cap_gun

Please log bugs or issues at our Trac.


Git 1.5.6 released

Git 1.5.6 has been released, and there are a lot of usability fixes and tweaks which should make the upgrade worth your while. Looking at the detailed list of changes since 1.5.5, it looks like submodules have been getting quite a bit of love from many contributors, so it might be time to give them another shot. Scroll down or search in the announcement for the part starting with "Changes since v1.5.5" and look through there for some of the submodule improvements that are coming.

The directions posted here worked fine for me to upgrade my existing source based installation in /usr/local.


Git lessons learned

Lessons learned from day to day use with various ruby and rails projects.

* Submodules completely suck when things get complex - I'm moving away from submodules and using direct exports for now, until I have time to research braid or piston 2.0. For more details on this, see this or this post on the github group.

* Use capistrano 2.2, not 2.3! 2.3 breaks git support.

* Always use :remote_cache for deployments -- super fast with git

* If you have weird errors, it probably means you need to pull - when in doubt pull to make sure you have the latest

* Branch more locally - I've been burned a few times when I've started work in master and then regretted it later when I wished my work wasn't in mainline (yes, it's possible to fix this after the fact, but that gets into more advanced git usage)


Refactotum Rails Conf 2008

I'm in Portland for Rails Conf with over 80% of the Relevance crew. We were testing out our "plane number" yesterday, but thank goodness American didn't let us down.

We'll be speaking today about how to contribute to open source at Refactotum, from 1:30 to 5. We will cover some tools to help you find the code with the most technical debt, go over example refactorings, and then spend the rest of the session going from project to project and helping out as folks hit obstacles. Please bring a laptop with any projects checked out that you'd like to hack on during the session (git preferred but not necessary).

Hope to see you there!

