Many UI test tools automate actions in the browser. I was suggesting that Selenium, or something like it (Badboy + JMeter), might be able to do what you want.<br><br>--Tony<br><br><div class="gmail_quote">On Fri, Nov 14, 2008 at 5:56 PM, Stuart Thiessen <span dir="ltr"><<a href="mailto:thiessenstuart@aol.com">thiessenstuart@aol.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">The img tag shows (for example) <img src="<a href="http://www.signbank.org/swis/glyph.php?code=45568" target="_blank">http://www.signbank.org/swis/glyph.php?code=45568</a>">. The glyph script returns the appropriate image file for that code number. I have the dump of the html, but it is these <img> tags that are blocking me from having a complete offline copy. That was why I was also thinking of trying to automate some kind of PDF dump of each page.<br>
<br>
Tony, I did look at the Selenium website, but I am not sure how it will help with this task. Could you explain?<br>
<br>
Thanks,<br>
<br>
Stuart<br>
<br>
On Nov 14, 2008, at 17:36 , Matthew Nuzum wrote:<br>
<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Fri, Nov 14, 2008 at 5:20 PM, Stuart Thiessen <<a href="mailto:thiessenstuart@aol.com" target="_blank">thiessenstuart@aol.com</a>> wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Anyway, my technical challenge is that the organization developing this<br>
writing system has published a PHP database of the symbols at:<br>
<a href="http://www.signbank.org/swis/data.php?subset=&bs_code=*" target="_blank">http://www.signbank.org/swis/data.php?subset=&bs_code=*</a><br>
<br>
I need to get an offline dump of each of the basesymbol child pages listed on<br>
that page. I can't do a simple download of the page as HTML because the<br>
image file showing the symbol is actually a link to a script that finds the<br>
right symbol and plugs it in, so when I use programs like wget, a broken<br>
link for the symbol image appears when I try to look at it offline.<br>
</blockquote>
<br>
Does wget download the symbols and give them a funny name like<br>
glyph.php?code=368.html or glyph.php?code=368.png?<br>
<br>
I don't see anything sneaky on that page that would prevent you from<br>
using an automated tool; my guess is that the names are getting<br>
mangled. If so, view the source of your downloaded page and see what<br>
the filename is that it's expecting and how it differs from what was<br>
actually generated. If they don't differ and the problem is really<br>
that the name is illegal for the filesystem then you may be able to<br>
just use a script to rename the images and the paths to the images in<br>
the html files.<br>
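<br>
As a rough illustration of that rename-and-rewrite idea, here is a minimal Python sketch. It assumes the images have already been saved locally under a hypothetical name of the form glyph_NNN.png (the naming scheme, the .png extension, and both function names are assumptions for illustration, not anything the site or wget guarantees), and it only rewrites the references inside the saved HTML text.<br>
<br>
```python
import re

# Matches a reference to the glyph script, with or without the absolute
# site prefix, and captures the numeric code.
GLYPH_URL = re.compile(
    r'(?:http://www\.signbank\.org/swis/)?glyph\.php\?code=(\d+)'
)


def local_glyph_name(code):
    """Hypothetical local filename for a glyph code (.png is an assumption)."""
    return "glyph_{}.png".format(code)


def rewrite_glyph_links(html):
    """Point every glyph.php?code=N reference at the local file instead."""
    return GLYPH_URL.sub(lambda m: local_glyph_name(m.group(1)), html)
```
<br>
Each saved page would be read, passed through rewrite_glyph_links, and written back, and the downloaded image files would be renamed to match whatever local_glyph_name returns. If the script actually serves a different image format, the extension would need to change accordingly.<br>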
<br>
I've run into this problem before and just tried a different<br>
downloader program. It's been a while since I've used one so I can't<br>
think of one to suggest at the moment.<br>
<br>
-- <br>
Matthew Nuzum<br>
newz2000 on freenode<br>
_______________________________________________<br>
Cialug mailing list<br>
<a href="mailto:Cialug@cialug.org" target="_blank">Cialug@cialug.org</a><br>
<a href="http://cialug.org/mailman/listinfo/cialug" target="_blank">http://cialug.org/mailman/listinfo/cialug</a><br>
</blockquote>
</blockquote></div><br>