Tuesday, 26 February 2008 @ 10:38am • computadora, mechanize
I thought I would share the part of my twitterbrite scripts that uploads videos to YouTube. It's about 30 lines long, and took me an hour or so to write. Most of my time was spent figuring out which form fields to fill out rather than writing code, though.
I've broken the script down to three parts: logging in, setting the video attributes, and uploading the video file.
Step 1: Logging In
The first step is pretty simple. Just instantiate a new mechanize object, fetch youtube.com, set your login credentials, and submit!
agent = WWW::Mechanize.new { |a|
  a.user_agent_alias = 'Mac Safari'
}
page = agent.get('http://youtube.com/')
# Login
page.form('loginForm') { |f|
  f.username = 'username'
  f.password = 'password'
}.submit
Step 2: Setting video attributes
This is probably the most difficult step. Now that the agent is logged in, we have to fetch the upload page and fill out the video attributes form. You have to set the title, description, category, and keywords for your video. Then you have to tell the agent to click a specific button, the one named 'action_upload'.
# Set the video attributes
page = agent.get('http://youtube.com/my_videos_upload')
form = page.form('theForm')
form.field_myvideo_title = 'My video title'
form.field_myvideo_descr = "My video description"
form.field_myvideo_categories = 28
form.field_myvideo_keywords = 'my tag'
page = form.submit(form.buttons.name('action_upload').first)
The number "28" is just the value from the category drop down list. You can iterate over the select options using mechanize, but I leave that as an exercise to the reader.
Step 3: Upload the video file
My script expects that the video file name will be supplied on the command line, so ARGV[0] should point to the file you want to upload. In this step, you simply set the video file name, then submit the form.
# Upload the video
page = page.form('theForm') { |f|
f.file_uploads.name('field_uploadfile').first.file_name = ARGV[0]
}.submit
page.body =~ /<textarea[^>]*>(.*)<\/textarea>/m
puts $1
The last two lines grab the HTML needed to display the video and print it.
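The regex can be tried out without touching YouTube. This is a minimal sketch, assuming the response holds the embed markup in a single textarea; the helper name extract_embed_html and the sample HTML are made up for illustration:

```ruby
# Hypothetical helper: pull the contents of a <textarea> out of a response body.
def extract_embed_html(body)
  # The /m flag lets "." match newlines, so multi-line embed code is captured too.
  match = body.match(/<textarea[^>]*>(.*)<\/textarea>/m)
  match && match[1]
end

# Made-up response body standing in for the real upload confirmation page.
sample = <<HTML
<html><body>
<textarea rows="3" cols="40"><object width="425" height="355">
</object></textarea>
</body></html>
HTML

puts extract_embed_html(sample)
```

Because (.*) is greedy, a page with more than one textarea would capture everything from the first opening tag to the last closing tag, so this only works cleanly when there is exactly one.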
There you go. Upload lots of videos now! Yay!
Posted by Aaron Patterson
Tuesday, 26 February 2008 @ 12:12am • betabrite, computadora, mechanize
If this doesn't win me the super nerd of the year award, I don't know what will. In fact, this is so nerdy that I'm kind of ashamed to write about it!
Remember my Twitterbrite post? Well, I registered twitterbrite.com, and now if my client catches a twit from you, it records the message on the LED sign and uploads it to my YouTube account. Go ahead, check out twitterbrite.com now!
Update: My scripts are too chatty, so now it will only post videos if the text of the twit contains 'betabrite'.
Thursday, 3 January 2008 @ 11:31am • computadora, ecma, mechanize, rkelly
I've just started getting the runtime working with RKelly. It's working well enough at this point that I was able to execute my earlier Fibonacci example. I've added a method to the runtime that allows you to define Ruby functions that may be called from inside JavaScript. For example, the alert function in the following example is defined in Ruby and delegates to puts.
runtime = RKelly::Runtime.new
runtime.define_function(:alert) do |*args|
  puts(*args)
end
runtime.execute(<<END
function f(n) {
  var s = 0;
  if(n == 0) return(s);
  if(n == 1) {
    s += 1;
    return(s);
  } else {
    return(f(n - 1) + f(n - 2));
  }
}
alert(f(20));
END
)
Here is the execution time with ruby 1.8.6 on my machine:
[aaron@mac-mini rkelly]$ time ~/.multiruby/install/1.8.6-p111/bin/ruby -I lib test.rb
6765
real 0m54.332s
user 0m53.913s
sys 0m0.336s
[aaron@mac-mini rkelly]$
Same code, same machine, but with ruby 1.9.0:
[aaron@mac-mini rkelly]$ time ~/.multiruby/install/1.9.0-0/bin/ruby -I lib test.rb
6765
real 0m20.863s
user 0m20.678s
sys 0m0.142s
[aaron@mac-mini rkelly]$
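As a sanity check on the output, here is the same function ported to plain Ruby. It is only a sketch for comparison and is not part of RKelly; it should print the same 6765 that both runs above produced.

```ruby
# Plain-Ruby port of the JavaScript f(n) above, kept structurally the same.
def f(n)
  s = 0
  return s if n == 0
  if n == 1
    s += 1
    s
  else
    f(n - 1) + f(n - 2)
  end
end

puts f(20)
```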
I need to get loops working next!
Sunday, 9 December 2007 @ 10:56pm • computadora, mechanize
I've been refactoring Mechanize for a 0.7.0 release. Basically I'm trying to clean the code up, and there are a few features that I think are unnecessary, but I would like to ask people before removing them.
- REXML as a parser.
I want to remove support for REXML. I don't use it. Hpricot seems to do everything I need.
- 1.8.2 thru 1.8.4 support
I've got a bunch of monkey patches for 1.8.2 thru 1.8.4. I'd like to remove these because I think most people are on 1.8.5 or up.
- WWW::Mechanize::Page#watch_for_set
I am going to remove this method. It made sense when REXML was the main parser, since REXML was so slow. I think that Hpricot is fast enough that this method is not so useful.
I'm going to make 0.7.0 lazily build up form and link objects, which should give everyone a slight speed increase but makes watch_for_set obsolete (sort of).
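To illustrate what building objects lazily means here, a minimal sketch follows. LazyPage is a made-up stand-in, not Mechanize's actual class, and it finds links with a crude regex rather than a real parser; the point is only that nothing is extracted until the first call, and the result is cached afterwards.

```ruby
# Hypothetical stand-in for a parsed page; not Mechanize's actual class.
class LazyPage
  attr_reader :parse_count

  def initialize(body)
    @body = body
    @parse_count = 0
    @links = nil
  end

  # Links are only extracted the first time someone asks for them.
  def links
    @links ||= parse_links
  end

  private

  # Crude href extraction, good enough for the illustration.
  def parse_links
    @parse_count += 1
    @body.scan(/href="([^"]*)"/).flatten
  end
end

page = LazyPage.new('<a href="/a">A</a> <a href="/b">B</a>')
puts page.parse_count # nothing has been parsed yet
p page.links          # first call does the work
p page.links          # second call reuses the cached array
puts page.parse_count
```

A script that only submits a form never pays for link extraction, which is where the speed increase would come from.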
I'm changing around the class names to be better organized, but they should all have the same methods. Also, if there are any feature requests, let me know!
Thursday, 15 March 2007 @ 6:11pm • mechanize
The Mechanize library is used for automating interaction with websites.
Mechanize automatically stores and sends cookies, follows redirects,
can follow links, and submit forms. Form fields can be populated and
submitted. Mechanize also keeps track of the sites that you have visited as
a history.
Changes:
Mechanize CHANGELOG
0.6.5
- Copying headers to a hash to prevent memory leaks
- Speeding up page parsing
- Aliased fields to elements
- Adding If-Modified-Since header
- Added delete_field! to form. Thanks to Sava Chankov
- Updated uri escaping to support high order characters. Thanks to Henrik Nyh.
- Better handling relative URIs. Thanks to Henrik Nyh
- Now handles pipes in URLs
http://rubyforge.org/tracker/?func=detail&aid=7140&group_id=1453&atid=5709
- Now escaping html entities in form fields.
http://rubyforge.org/tracker/?func=detail&aid=7563&group_id=1453&atid=5709
- Added MSIE 7.0 user agent string
Monday, 26 February 2007 @ 7:48pm • computadora, mechanize
mechanize version 0.6.5 has been released!
http://mechanize.rubyforge.org/
Saturday, 13 January 2007 @ 9:01pm • computadora, mechanize
I was debugging Mechanize the other day, and thought it would be handy to have a graph of objects in memory and their relationships with each other. So I put together a simple script that outputs a Graphviz file illustrating what each object points to. Here's the code:
require 'ograph'
require 'rubygems'
require 'mechanize'
mech = WWW::Mechanize.new
mech.get('http://google.com/')
puts ObjectGraph.graph(mech, /^WWW/)
and the output:

Right now it only supports Arrays, but should be easily extensible to support all other Enumerable types. Here's the code for ObjectGraph:
class ObjectGraph
  def self.graph(target, class_name = /./)
    stack = [target]
    object_links = []
    seen_objects = []
    seen_hash = {}
    while stack.length > 0
      object = stack.pop
      next if seen_hash.key? object.object_id
      seen_hash[object.object_id] = 1
      if object.is_a?(Enumerable) && ! object.is_a?(String)
        object.each { |iv|
          if iv.class.to_s =~ class_name || object.is_a?(Enumerable)
            object_links.push([object.object_id, iv.object_id])
            stack.push(iv)
          end
        }
      else
        object.instance_variables.each do |iv_sym|
          iv = object.instance_variable_get iv_sym
          if iv.class.to_s =~ class_name || iv.is_a?(Enumerable)
            object_links.push([object.object_id, iv.object_id])
            stack.push(iv)
          end
        end
      end
      seen_objects.push([object.object_id, object.class])
    end
    s = <<END
digraph g {
graph [ rankdir = "LR" ];
node [ fontsize = "8"
shape = "ellipse"
];
edge [ ];
END
    seen_objects.each { |id, klass|
      s += <<END
"#{id}" [
label = "<f0> #{id}|#{klass}"
shape = "record"
]
END
    }
    object_links.each_with_index { |(from, to), i|
      s += "\"#{from}\":f0 -> \"#{to}\":f0 [ id = #{i} ]\n"
    }
    s += "}\n"
    s
  end
end
Update: added Enumerable support, so Hashes are now graphed.
Saturday, 23 September 2006 @ 7:10pm • computadora, mechanize
I thought that while I was traveling in Europe I would be able to work on WWW::Mechanize, but then I found out that I can't do anything when I'm traveling. Traveling is too much work for me to do anything productive. Luckily I took a day off after my travels, and I was able to complete my work on Mechanize only a couple of days late. I think Mechanize is getting more popular, judging from the number of bugs submitted. I've fixed more bugs in this release than in any other. Check out the CHANGELOG for more details on the bug fixes.
As far as new features are concerned, I've added a submit method to forms. Before, you had to call submit on the mechanize object and pass in the form like this:
agent.submit(form)
Now you can just call the submit method on the form like this:
form.submit
Along the same lines, I've added a click method to links. Instead of passing a link to the agent object, you can just call the click method on the link like so:
link.click
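In Mechanize terms the two styles are agent.submit(form) versus form.submit, and agent.click(link) versus link.click. The sketch below shows how convenience methods like these can simply hand the object back to the agent. ToyAgent, ToyForm, and ToyLink are small stand-ins invented for illustration so the example runs without a network connection; they are not Mechanize's own code.

```ruby
# Stand-in classes for illustration only; they are not Mechanize's code.
class ToyAgent
  attr_reader :log

  def initialize
    @log = []
  end

  # Old style: the agent does the work and is handed the form or link.
  def submit(form)
    @log << "submit #{form.name}"
  end

  def click(link)
    @log << "click #{link.href}"
  end
end

class ToyForm
  attr_reader :name

  def initialize(agent, name)
    @agent = agent
    @name = name
  end

  # New style: the form passes itself back to the agent.
  def submit
    @agent.submit(self)
  end
end

class ToyLink
  attr_reader :href

  def initialize(agent, href)
    @agent = agent
    @href = href
  end

  # New style: the link passes itself back to the agent.
  def click
    @agent.click(self)
  end
end

agent = ToyAgent.new
form = ToyForm.new(agent, 'login')
link = ToyLink.new(agent, '/next')

agent.submit(form) # old style
form.submit        # new style, same effect
agent.click(link)  # old style
link.click         # new style, same effect
puts agent.log
```

Both styles end up in the same agent method, so the old calls keep working alongside the new ones.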
I wonder when people will figure out where the release names come from. Nobody has said anything about them to me... So far I have:
- Twan
- Sylvester
- Rufus
- Chuck
Monday, 4 September 2006 @ 2:48pm • computadora, mechanize
I've been working pretty hard on merging Mechanize with Hpricot into what I call Mechpricot, or Hprichanize. Things are going quite well.
I've totally cleaned up the Cookie code to reuse more of the cookie code from WEBrick, and in the process I found a bug in the way that WEBrick parses Set-Cookie headers. Unfortunately no one has replied to my patch on ruby-talk, so I think I'll try ruby-core next...
I removed the dependency on mime-types in favor of using code in WEBrick. I'd like to have as few dependencies as possible. I would like to make a few more performance tweaks before I release. I'll see if I can get this completely done before this weekend. Otherwise everyone will have to wait until I'm back from Spain!
Thursday, 10 August 2006 @ 1:12pm • computadora, life, mechanize
I was able to write a pretty quick mechanize script that grabbed the playlist from KEXP and put it in a SQLite database. From that, I came up with the top 4 songs played on KEXP for July:
- TV on the Radio - Playhouses
- Midlake - Roscoe
- Johnny Cash - God's Gonna Cut You Down
- Silversun Pickups - Lazy Eye
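The ranking step can be sketched in plain Ruby. This is an in-memory illustration only: the playlist rows and the top_songs helper are made up, and the real script kept its data in SQLite, where a count grouped by song would do the same job.

```ruby
# Hypothetical helper: count plays per song and return the n most played.
def top_songs(plays, n)
  counts = Hash.new(0)
  plays.each { |song| counts[song] += 1 }
  # Sort by play count descending, then by name so ties come out in a stable order.
  counts.sort_by { |song, count| [-count, song] }.first(n)
end

# Made-up playlist rows standing in for what a scraper would store.
plays = [
  'Artist A - Song 1',
  'Artist B - Song 2',
  'Artist A - Song 1',
  'Artist C - Song 3',
  'Artist A - Song 1',
  'Artist B - Song 2'
]

top_songs(plays, 2).each do |song, count|
  puts "#{count} plays: #{song}"
end
```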
I also used Gruff Graphs to make a graph:
