Read One liner Linux command to extract URLs from text files in Ehab Heikal’s blog
I find myself needing to extract URLs from text files quite a lot, and this is the easiest one-liner of Linux command-line magic I have found for doing it.
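cat filename | grep http | grep -shoP 'http.*?[" >]' > outfilename

The first grep is a cheap pre-filter: it discards lines that contain no "http" before the more expensive Perl-style regex runs, which helps reduce CPU load. The second grep uses Perl regex syntax (-P) to enable non-greedy matching, so each match ends at the nearest quote, space, or ">" and you can get multiple URLs out of a single line of HTML. The other flags keep the output clean: -o prints only the matched text, -h omits file names, and -s suppresses error messages. With the above, each match still includes the delimiter it stopped at, so you will get a trailing quote at the end most of the time. You can easily delete it in your favorite text editor by replacing all instances of a quote with nothing. Simple, short, and it works well.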
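
If you would rather skip the text-editor cleanup, here is a minimal sketch of a variant that never captures the trailing character in the first place. It is not the original one-liner: the function name extract_urls is made up for illustration, it assumes GNU grep built with -P support, and it assumes URLs end at a double quote, a space, or ">" (the same delimiters as above), so single-quoted attributes would need an extra character in the class.

```shell
# Hypothetical helper: print every http(s) URL found in the given files,
# one per line, without the delimiter that follows it.
extract_urls() {
    # -s: suppress errors about missing or unreadable files
    # -h: do not prefix matches with file names
    # -o: print only the matched text, one match per line
    # -P: Perl-compatible regex (GNU grep)
    # [^" >]+ matches up to, but not including, a quote, space, or '>',
    # so there is no trailing character left to trim afterwards.
    grep -shoP 'https?://[^" >]+' "$@"
}
```

Something like extract_urls filename | sort -u > outfilename would then also drop duplicate URLs, if that is useful.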