I have seen websites that do not allow links to images generated by scripts. To get around those restrictions I have used the Apache rewrite engine. You could do the same and make the browser think it is getting a normal PNG file.<br>
<br>A rule like this would do it (the dots are escaped so they only match a literal "."):<br>RewriteRule ^glyph\.([^.]*)\.png$ glyph.php?code=$1<br><br>Then change the img tags to point at names like glyph.123.png<br><br>
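To check which file names the pattern accepts before touching the server config, here is a minimal Python sketch that mimics the substitution with the re module. It is only an approximation, since mod_rewrite uses its own regex engine. It assumes a slightly stricter form of the pattern, with the dots escaped and the end anchored, and rewrite() is a made-up helper name used for illustration.<br>

```python
import re

# Python approximation of the rewrite pattern; Apache's mod_rewrite uses its
# own regex engine, so treat this as a sanity check rather than a proof.
GLYPH_PATTERN = re.compile(r"^glyph\.([^.]*)\.png$")


def rewrite(path):
    """Return the script URL a glyph.<code>.png name maps to, or None."""
    match = GLYPH_PATTERN.match(path)
    if match is None:
        return None
    # Mirrors the substitution part of the rule: glyph.php?code=$1
    return "glyph.php?code=" + match.group(1)


if __name__ == "__main__":
    for name in ("glyph.123.png", "glyph.45568.png", "glyphX123Ypng"):
        print(name, "->", rewrite(name))
```

The last name in the loop has no literal dots, so the escaped pattern rejects it, which an unescaped "." would not.<br>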
<br><div class="gmail_quote">On Fri, Nov 14, 2008 at 6:14 PM, Stuart Thiessen <span dir="ltr"><<a href="mailto:thiessenstuart@aol.com" target="_blank">thiessenstuart@aol.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>Oh, ok. I haven't used these tools before, so I wasn't sure where the connection was. Thanks for explaining. :)<div><br></div><font color="#888888"><div>Stuart</div></font><div><div></div><div>
<div><br><div><div>On Nov 14, 2008, at 18:06 , Tony Bibbs wrote:</div><br><blockquote type="cite">A lot of the UI test tools automate doing things in the browser. I was suggesting that it or something like it (Bad Boy + Jmeter) might be able to do what you want.<br>
<br>--Tony<br><br><div class="gmail_quote">On Fri, Nov 14, 2008 at 5:56 PM, Stuart Thiessen <span dir="ltr"><<a href="mailto:thiessenstuart@aol.com" target="_blank">thiessenstuart@aol.com</a>></span> wrote:<br> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
The img tag shows (for example) <img src="<a href="http://www.signbank.org/swis/glyph.php?code=45568" target="_blank">http://www.signbank.org/swis/glyph.php?code=45568</a>">. The glyph script returns the appropriate image file for that code number. I have the dump of the html, but it is these <img> tags that are blocking me from having a complete offline copy. That was why I was also thinking of trying to automate some kind of PDF dump of each page.<br>
<br> Tony, I did look at the Selenium website, but I am not sure how it will help with this task. Could you explain?<br> <br> Thanks,<br> <br> Stuart<br> <br> On Nov 14, 2008, at 17:36 , Matthew Nuzum wrote:<br> <br> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
On Fri, Nov 14, 2008 at 5:20 PM, Stuart Thiessen <<a href="mailto:thiessenstuart@aol.com" target="_blank">thiessenstuart@aol.com</a>> wrote:<br> <blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Anyway, my technical challenge is that the organization developing this<br> writing system has published a PHP database of the symbols at:<br> <a href="http://www.signbank.org/swis/data.php?subset=&bs_code=*" target="_blank">http://www.signbank.org/swis/data.php?subset=&bs_code=*</a><br>
<br> I need to get an offline dump of each of the basesymbol child pages listed on<br> that page. I can't do a simple download of the page as HTML because the<br> image file showing the symbol is actually a link to a script that finds the<br>
right symbol and plugs it in, so when I use programs like wget, a broken<br> link for the symbol image appears when I try to look at it offline.<br> </blockquote> <br> Does wget download the symbols and give them a funny name like<br>
glyph.php?code=368.html or glyph.php?code=368.png?<br> <br> I don't see anything sneaky on that page that would prevent you from<br> using an automated tool, my guess is that the names are getting<br> mangled. If so, view the source of your downloaded page and see what<br>
the filename is that it's expecting and how it differs from what was<br> actually generated. If they don't differ and the problem is really<br> that the name is illegal for the filesystem then you may be able to<br>
just use a script to rename the images and the paths to the images in<br> the html files.<br> <br> I've run into this problem before and just tried a different<br> downloader program. It's been a while since I've used one so I can't<br>
think of one to suggest at the moment.<br> <br> -- <br> Matthew Nuzum<br> newz2000 on freenode<br> _______________________________________________<br> Cialug mailing list<br> <a href="mailto:Cialug@cialug.org" target="_blank">Cialug@cialug.org</a><br>
<a href="http://cialug.org/mailman/listinfo/cialug" target="_blank">http://cialug.org/mailman/listinfo/cialug</a><br> </blockquote> <br>
</blockquote></div><br>
</blockquote></div><br></div></div></div></div>
<br></blockquote></div><br>