I brought up the topic of extracting URLs from a file because I needed to extract a particular string. I wanted to grab screencasts from a site so I could review them offline. I'm presuming it's legal because they're available in MP4 format. Unfortunately, the page itself didn't provide any links to the files; they were all embedded in the call made by the Flash player. The URLs were all there in the page source, but I would have had to wade through hundreds of lines to get at them.
The links all came in the format:
http://blahblahblah.edu/media/common/media/ef8c746d-bc0a-4820-922b-1d654ad15189_opencast_screen_alldist.mp4
After fiddling around a bit with Perl and the regular expression string, I finally stumbled upon what I was looking for:
# For each line containing file=...opencast_screen_alldist.mp4, print only the captured URL
perl -n -e 'chomp; s/.*(?:(?i)file=)(.*opencast_screen_alldist\.mp4).*/$1\n/xg and print' course_details_new.php.html
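For comparison, roughly the same extraction can probably be done without Perl. The following is only a sketch using grep -o and sed, not the approach I actually used. The sample line and the function name extract_mp4_urls are made up for illustration, and it assumes each URL ends before a quote, ampersand, or whitespace, which may not hold for the real page.

```shell
# Hypothetical sample standing in for the saved page source; the real
# course_details_new.php.html is not reproduced here.
sample='<embed flashvars="autostart=false&file=http://blahblahblah.edu/media/common/media/ef8c746d-bc0a-4820-922b-1d654ad15189_opencast_screen_alldist.mp4&width=640">
<p>no media on this line</p>'

# extract_mp4_urls (illustrative name) reads HTML on stdin and prints one URL per line.
extract_mp4_urls() {
  # grep -o prints only the matched text; -i mirrors the (?i) in the Perl version.
  # [^"&[:space:]]* keeps the match from running past the end of the parameter value.
  grep -oiE 'file=[^"&[:space:]]*opencast_screen_alldist\.mp4' |
    # Strip the leading file= (in any letter case) so only the URL remains.
    sed 's/^[Ff][Ii][Ll][Ee]=//'
}

urls=$(printf '%s\n' "$sample" | extract_mp4_urls)
printf '%s\n' "$urls"
```

On the sample above this prints the single MP4 URL. Unlike the Perl substitution, which keeps at most one URL per line, grep -o emits every match on a line, which may be preferable if the player call packs several files into one line.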