I thought I would be able to work on WWW::Mechanize while traveling in Europe, but it turns out I can’t get anything productive done on the road. Traveling is too much work on its own. Luckily I took a day off after my travels and was able to finish my work on Mechanize only a couple of days late. I think Mechanize is getting more popular, judging from the number of bugs submitted. I’ve fixed more bugs in this release than in any other. Check out the CHANGELOG for more details on the bug fixes.
As far as new features are concerned, I’ve added a submit method to forms. Before, you had to call submit on the mechanize object and pass in the form like this:
page = agent.submit form
Now you can just call the submit method on the form like this:
page = form.submit
Along the same line, I’ve added a click method to links. Instead of passing a link to the agent object, you can just call the click method on the link like so:
page = link.click
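To picture what these two shortcuts do, here is a minimal sketch of the delegation idea. It is not Mechanize's actual source: FakeAgent, FakeForm, and FakeLink are hypothetical stand-ins, and the assumption is only that each form and link remembers the agent that fetched it and hands itself back to that agent.

```ruby
# Hypothetical stand-ins, NOT the real WWW::Mechanize classes.
class FakeAgent
  attr_reader :log

  def initialize
    @log = []
  end

  # Old style: the agent is given the form to submit.
  def submit(form)
    @log << [:submit, form.name]
    "page for #{form.name}"
  end

  # Old style: the agent is given the link to follow.
  def click(link)
    @log << [:click, link.href]
    "page at #{link.href}"
  end
end

class FakeForm
  attr_reader :name

  def initialize(agent, name)
    @agent = agent
    @name = name
  end

  # New style: the form passes itself to the agent it came from.
  def submit
    @agent.submit(self)
  end
end

class FakeLink
  attr_reader :href

  def initialize(agent, href)
    @agent = agent
    @href = href
  end

  # New style: the link passes itself to the agent it came from.
  def click
    @agent.click(self)
  end
end

agent = FakeAgent.new
form = FakeForm.new(agent, "login")
link = FakeLink.new(agent, "/next")

old_page = agent.submit(form) # before
new_page = form.submit        # now
clicked  = link.click
```

In this sketch both call styles end up in the same agent method, so the old form of the call keeps working alongside the new one.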
I wonder when people will figure out where the release names come from. Nobody has said anything about them to me. So far I have:
- Twan
- Sylvester
- Rufus
- Chuck
Mechanize seems trapped in the closet.
I’m glad someone gets it.
Hey,
Has anyone successfully logged in to Netflix using Mechanize? I am
pretty new at this, but I tried to follow the “guides” instructions.
However it seems that the login page for netflix uses Javascript to
process its form, and so isn’t parsed correctly by Mechanize – or at
least must be handled differently than those examples covered in the
guide.
The login url for Netflix is http://www.netflix.com/Login? (Actually
there seem to be many possible urls for this. I’m not sure what they
are using the urls to track)
The trouble is that when I pretty print what Mechanize fetches, while
there is a form called login, it doesn’t have any login fields to
speak of. Instead the submit button fires off a javascript which (I
assume) reads in two other text boxes on the page “password” and
“email”. One can see these fields in the source. However they don’t
seem to lie within the form. I don’t know how to set these other fields’
values, because at least in the examples Mechanize only modifies
form.field not just any old field hanging out in the wild.
Joe
I have the same issue as Joseph Coffey – I’m trying to access a login form that clearly has input fields for username and password. But the only fields mechanize sees are the hidden fields.
Like Joseph’s, my page is full of javascript that manipulates page elements. But when you strip it all away, it’s just plain:
<form>
<input type="text" name="user">
<input type="password" name="password">
<input type="submit" name="login">
<input type="hidden" value="whatever">
<input type="hidden" value="whatever2">
</form>
…but when I pp the form in mechanize, it only sees the hidden ones.
I could probably supply the source for the page if it would help, but I’d have to clean it up because it’s something internal.
Thanks
Pat
Hey Pat. When I worked with Joe on his problem, I found that the source for the page contained malformed HTML. Can you send the HTML to me, or the mailing list? Then I can take a look and tell you what the problem is.
If you don’t want to send in the original, it would be helpful if you could trim the html down to the smallest example that reproduces the problem.
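While trimming the page down, a rough check like the one below may show whether the visible fields really sit between the form's opening and closing tags in the raw source. This is only a sketch: it is a crude regex scan rather than a real HTML parser, the helper name inputs_by_position and the sample markup are made up for illustration, and it assumes the trouble is inputs that end up outside the form.

```ruby
# Rough diagnostic only: a regex scan, not an HTML parser.
# inputs_by_position is a hypothetical helper, not part of Mechanize.
def inputs_by_position(html)
  found = { :inside => [], :outside => [] }
  in_form = false

  # Walk through every <form>, </form> and <input> tag in source order.
  html.scan(/<\s*(\/?)\s*(form|input)\b([^>]*)>/i) do |closing, tag, attrs|
    if tag.downcase == "form"
      # An opening tag puts us inside the form, a closing tag takes us out.
      in_form = closing.empty?
    else
      name = attrs[/\bname\s*=\s*["']?([^"'\s>]+)/i, 1] || "(unnamed)"
      found[in_form ? :inside : :outside] << name
    end
  end

  found
end

# Made-up sample: the form closes before the visible fields appear.
sample = <<HTML
<form name="login"><input type="hidden" name="token" value="x"></form>
<input type="text" name="email">
<input type="password" name="password">
HTML

result = inputs_by_position(sample)
puts result.inspect
```

If the fields you expect show up under :outside, that part of the markup is a good candidate for the smallest example that reproduces the problem.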
I am trying to log in to a page via a POST form.
My code looks like this:
agent = WWW::Mechanize.new
agent.user_agent_alias = "Windows IE 6" # tried other agents as well
page = agent.post(URI.parse('http://www.nzb.to/'))
link = page.links.text("einloggencome in")
page = agent.click(link)
form = page.forms[1]
form["username"] = "energydrink"
form["pass"] = "energydrink"
puts agent.cookies.size
puts "duppidoo" if page.body =~ /eingeloggt .* energydrink/
I tried it like this on another page, and it works fine.
But on this page I have tried it over and over, and it doesn’t seem to work.
There are 3 forms on this page, all with the same name ("form1"). Could this be a problem?
Does anyone have an idea that could help?
Thanks in advance
Pete
I am trying to use WWW::Mechanize with Aptana RadRails for Windows to scrape a website, but any time I run the line
page = agent.get(url)
I get an error:
undefined method `inner_text' for #
Is anyone familiar with the solution to this problem?
I have the following application environment:
Ruby version 1.8.6 (i386-mswin32)
RubyGems version 0.9.2
Rails version 1.2.3
Thanks,
HM
What version of hpricot do you have installed? What version of mechanize?